Speech Emotion Recognition Based on Fusion Method
Subject Areas: Speech Processing
Sara Motamed 1, Saeed Setayeshi 2 *, Azam Rabiee 3, Arash Sharifi 4
1 - Islamic Azad University, Fooman Branch
2 - Amirkabir University
3 - Islamic Azad University, Isfahan
4 - Islamic Azad University, Isfahan
Keywords: Speech Emotion Recognition, Mel Frequency Cepstral Coefficient (MFCC), Fixed and Variable Structures Stochastic Automata, Multi-Constraint, Fusion Method
Abstract:
Emotional speech is among the quickest and most natural channels in human communication, which has led researchers to develop speech emotion recognition as a fast and efficient technique for human-machine interaction. This paper introduces a new classification method that applies a multi-constraint partitioning approach to emotional speech signals. To classify these signals, feature vectors are extracted using Mel-frequency cepstral coefficients (MFCC), autocorrelation function coefficients (ACFC), and a combination of the two. The study shows how the number of features and the choice of fusion method affect the emotional speech recognition rate. The proposed model is compared with a multilayer perceptron (MLP) recognizer. Results reveal that the proposed algorithm has a strong capability to identify human emotion.
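To make the feature-extraction step concrete, the following is a minimal sketch of the pipeline the abstract describes: per-utterance MFCC and autocorrelation function coefficient (ACFC) vectors, fused by concatenation. The library choice (librosa), the coefficient counts, the averaging over frames, and the file name are all assumptions for illustration, not details taken from the paper.

```python
# Hypothetical sketch of the feature pipeline described in the abstract:
# extract MFCC and ACFC vectors from an utterance and fuse them.
# librosa and all parameter values are assumptions, not from the paper.
import numpy as np
import librosa

def extract_mfcc(signal, sr, n_mfcc=13):
    """Mean MFCC vector over all frames (13 coefficients assumed)."""
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)

def extract_acfc(signal, n_lags=13):
    """First n_lags normalized autocorrelation coefficients."""
    signal = signal - signal.mean()
    ac = np.correlate(signal, signal, mode="full")[len(signal) - 1:]
    ac = ac / ac[0]                      # normalize so lag 0 equals 1
    return ac[1:n_lags + 1]

def fused_features(signal, sr):
    """Feature-level fusion: concatenate the MFCC and ACFC vectors."""
    return np.concatenate([extract_mfcc(signal, sr), extract_acfc(signal)])

# Usage with an assumed file name and sample rate:
# y, sr = librosa.load("utterance.wav", sr=16000)
# x = fused_features(y, sr)   # feature vector fed to the classifier
```

The resulting fused vector would then be the input to the classifier; the paper compares its multi-constraint partitioning approach against an MLP at this stage, a step not reproduced in this sketch.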