Advanced Model Implementation to Recognize Emotion Based Speech with Machine Learning
Keywords:
Multi-Layer Perceptron, Speech Emotion Recognition, NLP, Mel-frequency cepstral coefficients, modulation spectral features

Abstract
Emotions are essential in developing interpersonal relationships. They make empathizing with others' problems easier and lead to better communication with fewer misunderstandings. Humans naturally understand others' emotions from their speech, hand gestures, facial expressions, etc., and react accordingly, but machines cannot extract and understand emotions unless they are trained to do so. Speech Emotion Recognition (SER) is one step in that direction: it uses machine-learning algorithms to predict the emotion behind an utterance. Features including the mel spectrogram, MFCCs, and chroma are extracted from a set of audio clips using Python libraries and used to build the model. A Multi-Layer Perceptron (MLP) maps these features, extracted from each sound file, to a predicted emotion. This paper details the development and deployment of the model. Speech Emotion Recognition identifies emotional characteristics in speech signals computationally, analyzing and comparing acoustic feature parameters against the emotional changes they reflect. Speech emotion recognition is an emerging cross-disciplinary field of artificial intelligence.