Venue: European Conference on Artificial Intelligence; Conference on Prestigious Applications of Intelligent Systems

End-To-End Speech Emotion Recognition Based on Time and Frequency Information Using Deep Neural Networks

Abstract

We propose a speech emotion recognition system based on deep neural networks that operates on raw speech data in an end-to-end manner to predict continuous emotions in arousal-valence space. The model is trained using time and frequency information from speech recordings in the publicly available part of the multi-modal RECOLA database. We use the Concordance Correlation Coefficient (CCC), as proposed by the Audio-Visual Emotion Challenges, to measure the similarity between the network prediction and the gold standard. The CCC results of our model outperform those achieved by other state-of-the-art end-to-end models. The novel aspect of our study is that it applies an end-to-end approach to data that was previously used mostly by approaches involving combinations of pre-processing or post-processing. Our study used only a small subset of the RECOLA dataset and obtained better results than previous studies that used the full dataset.
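The abstract names the CCC as its evaluation metric but does not spell out the formula. As an illustration only, the sketch below follows the standard definition of Lin's concordance correlation coefficient, CCC = 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²), using population (divide-by-n) statistics. The function name `concordance_cc` and the handling of the zero-denominator case are assumptions made for this example; they are not taken from the paper, and this is not the authors' evaluation code.

```python
def concordance_cc(pred, gold):
    """Lin's concordance correlation coefficient for two equal-length sequences.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    n = len(pred)
    if n == 0 or n != len(gold):
        raise ValueError("pred and gold must be non-empty and of equal length")

    mean_p = sum(pred) / n
    mean_g = sum(gold) / n

    # Population (divide-by-n) variances and covariance.
    var_p = sum((p - mean_p) ** 2 for p in pred) / n
    var_g = sum((g - mean_g) ** 2 for g in gold) / n
    cov = sum((p - mean_p) * (g - mean_g) for p, g in zip(pred, gold)) / n

    denom = var_p + var_g + (mean_p - mean_g) ** 2
    if denom == 0:
        # Both sequences are constant and identical. Treating this as perfect
        # agreement is a convention chosen for this sketch.
        return 1.0
    return 2 * cov / denom


if __name__ == "__main__":
    gold = [1.0, 2.0, 3.0]
    print(concordance_cc(gold, gold))             # identical traces
    print(concordance_cc([2.0, 3.0, 4.0], gold))  # same shape, constant offset
```

Unlike the Pearson correlation, the CCC penalizes a constant offset between prediction and gold standard: the second call above scores below 1 even though the two traces are perfectly linearly correlated. For arousal-valence prediction the metric would typically be computed separately for each dimension.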
