Advanced Model Implementation to Recognize Emotion Based Speech with Machine Learning

Authors

  • Kanakam Siva Rama Prasad, Professor & Head, Department of Artificial Intelligence & Data Science, Pace Institute of Technology and Sciences, Ongole, Andhra Pradesh, India
  • N Srinivasa Rao, Associate Professor, Department of Artificial Intelligence & Data Science, Pace Institute of Technology and Sciences, Ongole, Andhra Pradesh, India
  • B Sravani, Assistant Professor, Department of Artificial Intelligence & Data Science, Pace Institute of Technology and Sciences, Ongole, Andhra Pradesh, India

Keywords:

Multi-Layer Perceptron, Speech Emotion Recognition, NLP, Mel-frequency cepstral coefficients, modulation spectral features

Abstract

Emotions are essential in developing interpersonal relationships. They make empathizing with others' problems easier and lead to better communication with fewer misunderstandings. Humans naturally understand others' emotions from their speech, hand gestures, facial expressions, and so on, and react accordingly, but machines cannot extract and understand emotions unless they are trained to do so. Speech Emotion Recognition (SER) is a step in that direction: it uses machine learning algorithms to predict the emotion behind an utterance. Features including the Mel spectrogram, MFCCs, and Chroma are extracted from a set of audio clips using Python libraries and are used to build the ML model. A Multi-Layer Perceptron (MLP) maps these features to each sound file and predicts the emotion. This paper details the development and deployment of the model. Speech Emotion Recognition identifies emotional characteristics in speech signals by computer, then contrasts and analyzes the characteristic parameters against the emotional changes observed. In the current market, speech emotion recognition is an emerging cross-disciplinary field of artificial intelligence.
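To make the pipeline described above concrete, the sketch below shows one plausible implementation, assuming the librosa and scikit-learn Python libraries (the paper does not publish its exact code); the file names, emotion labels, and MLP hyperparameters are illustrative placeholders, not the authors' settings.

import numpy as np
import librosa
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

def extract_features(path):
    # One fixed-length vector per clip: time-averaged MFCC, Chroma,
    # and Mel-spectrogram frames stacked together.
    signal, sr = librosa.load(path, sr=None)
    stft = np.abs(librosa.stft(signal))
    mfcc = np.mean(librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=40).T, axis=0)
    chroma = np.mean(librosa.feature.chroma_stft(S=stft, sr=sr).T, axis=0)
    mel = np.mean(librosa.feature.melspectrogram(y=signal, sr=sr).T, axis=0)
    return np.hstack([mfcc, chroma, mel])

# Placeholder corpus: substitute paths and labels from a real labeled
# emotional-speech dataset before running.
files = ["angry_01.wav", "happy_01.wav", "sad_01.wav", "calm_01.wav"]
labels = ["angry", "happy", "sad", "calm"]

X = np.array([extract_features(f) for f in files])
y = np.array(labels)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25,
                                                    random_state=9)

# Illustrative MLP hyperparameters, not the paper's reported ones.
model = MLPClassifier(hidden_layer_sizes=(300,), alpha=0.01,
                      batch_size=256, max_iter=500)
model.fit(X_train, y_train)
print("Accuracy:", accuracy_score(y_test, model.predict(X_test)))

Averaging each feature matrix over time is one simple way to turn variable-length audio into the fixed-size input an MLP requires; a fuller treatment might use the modulation spectral features named in the keywords instead.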




Published

2022-12-30

How to Cite

Advanced Model Implementation to Recognize Emotion Based Speech with Machine Learning. (2022). International Journal of Innovative Research in Engineering & Management, 9(6), 47–54. Retrieved from https://acspublisher.com/journals/index.php/ijirem/article/view/10666