International Conference on Signal Processing and Communication Systems

Optimized multi-channel deep neural network with 2D graphical representation of acoustic speech features for emotion recognition

Abstract

This study investigates the effectiveness of speech emotion recognition using a new approach called the Optimized Multi-Channel Deep Neural Network (OMC-DNN). The proposed method was tested with input features given as simple 2D black-and-white images representing graphs of either the MFCC coefficients or the TEO parameters, calculated from speech (MFCC-S, TEO-S) or from glottal waveforms (MFCC-G, TEO-G). A comparison with 6 different single-channel benchmark classifiers showed that the OMC-DNN provided the best performance in both pair-wise (emotion vs. neutral) and simultaneous multiclass recognition of 7 emotions (anger, boredom, disgust, happiness, fear, sadness and neutral). In the pair-wise case, the OMC-DNN outperformed the single-channel DNN by 5%-10%, depending on the feature set. In the multiclass case, the OMC-DNN outperformed or matched its single-channel equivalents for all features. The speech spectrum and the glottal energy characteristics were identified as two important factors in discriminating between different types of categorical emotions in speech.
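The abstract does not specify how the TEO parameters or the 2D graph images were computed, so the following is only a minimal sketch under stated assumptions. It uses the standard discrete Teager energy operator, psi[n] = x[n]^2 - x[n-1]*x[n+1], and a deliberately simple rasterizer that draws one sequence as a single-pixel curve. The function names `teager_energy` and `curve_to_bw_image`, the image height of 32, and the single-curve layout are hypothetical choices for illustration, not the authors' pipeline.

```python
import math


def teager_energy(x):
    """Standard discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    The output is two samples shorter than the input (no value at either end).
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def curve_to_bw_image(values, height=32):
    """Rasterize a 1-D sequence as a black-and-white graph (1 = black pixel).

    Hypothetical layout: one column per sample, one black pixel per column.
    """
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant sequence
    image = [[0] * len(values) for _ in range(height)]
    for col, v in enumerate(values):
        # Larger values are drawn nearer the top row of the image.
        row = round((hi - v) / span * (height - 1))
        image[row][col] = 1
    return image


# For a pure tone A*cos(omega*n) the operator yields the constant A^2 * sin(omega)^2.
A, omega = 2.0, 0.3
tone = [A * math.cos(omega * n) for n in range(64)]
energy = teager_energy(tone)
image = curve_to_bw_image(tone)
```

In a multi-channel setting, one such image per feature stream (for example speech-derived and glottal-derived features) could be fed to separate input channels; how the OMC-DNN actually combines its channels is not described in the abstract.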
