Venue: IEEE International Conference on Acoustics, Speech and Signal Processing

Adieu features? End-to-end speech emotion recognition using a deep convolutional recurrent network


Abstract

The automatic recognition of spontaneous emotions from speech is a challenging task. On the one hand, acoustic features need to be robust enough to capture the emotional content for various styles of speaking; on the other, machine learning algorithms need to be insensitive to outliers while being able to model the context. Whereas the latter has been tackled by the use of Long Short-Term Memory (LSTM) networks, the former is still under very active investigation, even though more than a decade of research has provided a large set of acoustic descriptors. In this paper, we propose a solution to the problem of "context-aware" emotionally relevant feature extraction, by combining Convolutional Neural Networks (CNNs) with LSTM networks, in order to automatically learn the best representation of the speech signal directly from the raw time representation. In this novel work on so-called end-to-end speech emotion recognition, we show that the proposed topology significantly outperforms the traditional approaches based on signal processing techniques for the prediction of spontaneous and natural emotions on the RECOLA database.
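
The pipeline outlined in the abstract (convolution over the raw waveform, followed by a recurrent LSTM layer that models context) can be illustrated with a toy sketch. This is not the authors' network: the single filter, the scalar LSTM state, the pooling width, the random untrained weights, and all function names here are assumptions chosen only to keep the example small and dependency-free.

```python
import math
import random

def conv1d(signal, kernel):
    """Valid 1-D cross-correlation of a raw waveform with one filter."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def relu_maxpool(xs, width):
    """ReLU followed by non-overlapping max pooling of the given width."""
    return [max(0.0, max(xs[i:i + width]))
            for i in range(0, len(xs) - width + 1, width)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h, c, w):
    """One step of an LSTM cell with scalar state; w maps gate -> (w_x, w_h, b)."""
    gate = lambda name: w[name][0] * x + w[name][1] * h + w[name][2]
    i, f, o = sigmoid(gate("i")), sigmoid(gate("f")), sigmoid(gate("o"))
    g = math.tanh(gate("g"))
    c = f * c + i * g          # update the cell memory
    h = o * math.tanh(c)       # expose a gated view of the memory
    return h, c

def predict(waveform, kernel, pool, lstm_w):
    """Convolutional front end on raw samples, then an LSTM over pooled frames."""
    feats = relu_maxpool(conv1d(waveform, kernel), pool)
    h = c = 0.0
    outputs = []
    for x in feats:
        h, c = lstm_step(x, h, c, lstm_w)
        outputs.append(h)      # one continuous value per pooled frame (assumed)
    return outputs

# Hypothetical input: 160 samples of a synthetic sine wave, not real speech.
random.seed(0)
waveform = [math.sin(2 * math.pi * 5 * t / 160) for t in range(160)]
kernel = [random.uniform(-0.5, 0.5) for _ in range(8)]
lstm_w = {name: (random.uniform(-1, 1), random.uniform(-1, 1), 0.0)
          for name in "ifog"}
preds = predict(waveform, kernel, pool=4, lstm_w=lstm_w)
print(len(preds))
```

In a trained system the filter and gate weights would be learned jointly from labeled data, which is the sense in which the representation is learned "end to end" rather than hand-designed.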
