Journal: Computers, Materials & Continua (English)

The Efficacy of Deep Learning-Based Mixed Model for Speech Emotion Recognition

         

Abstract

Human speech indirectly conveys the speaker's mental state or emotion. Artificial Intelligence (AI)-based techniques that recognize emotion from speech could therefore be transformative. In this study, we introduce a robust method for emotion recognition from human speech that combines an effective preprocessing technique with a deep learning-based mixed model consisting of Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN) components. About 2800 audio files were taken from the Toronto emotional speech set (TESS) database. A high-pass filter and a Savitzky-Golay filter were used to obtain noise-free, smooth audio data. Seven emotions were considered: Angry, Disgust, Fear, Happy, Neutral, Pleasant-surprise, and Sad. Energy, fundamental frequency, and Mel Frequency Cepstral Coefficients (MFCC) were used as emotion features, and with these features the mixed LSTM+CNN model reached 97.5% accuracy. The mixed model was found to perform better than the usual state-of-the-art models for speech emotion recognition, which suggests that it could be used effectively in further research on sound processing.
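
The preprocessing and two of the three features named in the abstract can be illustrated with a minimal sketch. The abstract gives no filter parameters, so the cutoff frequency, filter order, smoothing window, polynomial order, and frame sizes below are all assumptions, and the autocorrelation pitch estimator is a common textbook choice rather than the authors' stated method. The function names are hypothetical.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, savgol_filter


def preprocess(signal, sample_rate, cutoff_hz=80.0, window=11, polyorder=3):
    """High-pass filter followed by Savitzky-Golay smoothing.

    cutoff_hz, window and polyorder are assumed values, not taken from the paper.
    """
    # 4th-order Butterworth high-pass; sosfiltfilt runs it forward and
    # backward so the result has no phase shift.
    sos = butter(4, cutoff_hz, btype="highpass", fs=sample_rate, output="sos")
    filtered = sosfiltfilt(sos, signal)
    # Savitzky-Golay fits a low-order polynomial inside each sliding window.
    return savgol_filter(filtered, window_length=window, polyorder=polyorder)


def short_time_energy(signal, frame_len=400, hop=160):
    """Sum of squared samples in each frame (frame sizes are assumed)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    return np.array(
        [np.sum(signal[i * hop : i * hop + frame_len] ** 2) for i in range(n_frames)]
    )


def estimate_f0(frame, sample_rate, fmin=75.0, fmax=400.0):
    """Fundamental frequency of one frame via the autocorrelation peak."""
    frame = frame - np.mean(frame)
    # Keep only non-negative lags of the autocorrelation.
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1 :]
    lag_min = int(sample_rate / fmax)
    lag_max = int(sample_rate / fmin)
    # The strongest peak inside the plausible pitch range gives the period.
    lag = lag_min + np.argmax(ac[lag_min : lag_max + 1])
    return sample_rate / lag
```

MFCC extraction and the LSTM+CNN classifier are left out here, since the abstract does not describe their configuration.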
