Advanced Model Implementation to Recognize Emotion Based Speech with Machine Learning

Authors

  • Kanakam Siva Rama Prasad, Professor & Head, Department of Artificial Intelligence & Data Science, Pace Institute of Technology and Sciences, Ongole, Andhra Pradesh, India
  • N Srinivasa Rao, Associate Professor, Department of Artificial Intelligence & Data Science, Pace Institute of Technology and Sciences, Ongole, Andhra Pradesh, India
  • B Sravani, Assistant Professor, Department of Artificial Intelligence & Data Science, Pace Institute of Technology and Sciences, Ongole, Andhra Pradesh, India

Keywords:

Multi-Layer Perceptron, Speech Emotion Recognition, NLP, Mel-frequency cepstral coefficients, modulation spectral features

Abstract

Emotions are essential in developing interpersonal relationships. They make empathizing with others' problems easier and lead to better communication with fewer misunderstandings. Humans naturally understand others' emotions from their speech, hand gestures, facial expressions, and so on, and react accordingly, but machines cannot extract and understand emotions unless they are trained to do so. Speech Emotion Recognition (SER) is a step in that direction: it uses machine learning algorithms to predict the emotion behind an utterance. Features including the Mel spectrogram, MFCCs, and Chroma are extracted from a set of audio clips using Python libraries and are used to build the ML model. A Multi-Layer Perceptron (MLP) maps these features to each sound file and predicts the emotion. This paper details the development and deployment of the model. Speech Emotion Recognition identifies emotional characteristics in speech signals by computer, then contrasts and analyzes the characteristic parameters against the emotional changes observed. In the current market, speech emotion recognition is an emerging cross-disciplinary field of artificial intelligence.
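To make the pipeline described above concrete, the sketch below shows one plausible implementation, assuming the librosa and scikit-learn Python libraries (the paper does not publish its exact code); the file names, emotion labels, and MLP hyperparameters are illustrative placeholders, not the authors' settings.

import numpy as np
import librosa
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

def extract_features(path):
    # One fixed-length vector per clip: time-averaged MFCC, Chroma,
    # and Mel-spectrogram frames stacked together.
    signal, sr = librosa.load(path, sr=None)
    stft = np.abs(librosa.stft(signal))
    mfcc = np.mean(librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=40).T, axis=0)
    chroma = np.mean(librosa.feature.chroma_stft(S=stft, sr=sr).T, axis=0)
    mel = np.mean(librosa.feature.melspectrogram(y=signal, sr=sr).T, axis=0)
    return np.hstack([mfcc, chroma, mel])

# Placeholder corpus: substitute paths and labels from a real labeled
# emotional-speech dataset before running.
files = ["angry_01.wav", "happy_01.wav", "sad_01.wav", "calm_01.wav"]
labels = ["angry", "happy", "sad", "calm"]

X = np.array([extract_features(f) for f in files])
y = np.array(labels)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25,
                                                    random_state=9)

# Illustrative MLP hyperparameters, not the paper's reported ones.
model = MLPClassifier(hidden_layer_sizes=(300,), alpha=0.01,
                      batch_size=256, max_iter=500)
model.fit(X_train, y_train)
print("Accuracy:", accuracy_score(y_test, model.predict(X_test)))

Averaging each feature matrix over time is one simple way to turn variable-length audio into the fixed-size input an MLP requires; a fuller treatment might use the modulation spectral features named in the keywords instead.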




Published

2022-12-30

How to Cite

Advanced Model Implementation to Recognize Emotion Based Speech with Machine Learning. (2022). International Journal of Innovative Research in Engineering & Management, 9(6), 47–54. Retrieved from https://acspublisher.com/journals/index.php/ijirem/article/view/10666