Feature Extraction and Processing Analysis in Speech Recognition
DOI:
https://doi.org/10.55524/
Keywords:
Feature Extraction, Processing, Sinusoidal Model, Speech Recognition, Stressed
Abstract
The automated recognition and synthesis of varied speech patterns have become significant research problems in recent years. In this feature analysis, the characteristics of stress-induced speech are compared with those of normal speech. Stress substantially degrades speech recognition performance. In a speech communication system, the voice signal is transmitted, stored, and processed in a variety of ways, and it must be delivered so that its information content can be easily extracted by human listeners or by machines. To improve recognition performance, a stress compensation method is employed to compensate for stress distortion. Features are collected from English speech and assessed to identify different emotional states in the speech signal. The variations in glottal excitation across common speaking styles are examined in depth in this article. The results indicate that the sinusoidal model effectively describes the different stress classes in a speech signal. For detecting emotion in a speaker under stress, sinusoidal features outperform linear prediction features.