Learning utterance-level representations for speech emotion and age/gender recognition using deep neural networks

Abstract

Accurately recognizing speaker emotion and age/gender from speech can provide better user experience for many spoken dialogue systems. In this study, we propose to use deep neural networks (DNNs) to encode each utterance into a fixed-length vector by pooling the activations of the last hidden layer over time. The feature encoding process is designed to be jointly trained with the utterance-level classifier for better classification. A kernel extreme learning machine (ELM) is further trained on the encoded vectors for better utterance-level classification. Experiments on a Mandarin dataset demonstrate the effectiveness of our proposed methods on speech emotion and age/gender recognition tasks.
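
The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the mean pooling, ReLU activations, RBF kernel, layer sizes, random untrained weights, and the values of `C` and `gamma` are all assumptions made for illustration, and the joint training of the encoder with the utterance-level classifier is omitted. The function names are hypothetical. The sketch shows only the two ideas named in the abstract: pooling last-hidden-layer activations over time gives a fixed-length vector for any utterance length, and a kernel ELM is fitted on those vectors in closed form.

```python
import numpy as np

def encode_utterance(frames, weights, biases):
    """Map a (T, D) matrix of frame features to one fixed-length vector."""
    h = frames
    for W, b in zip(weights, biases):
        h = np.maximum(0.0, h @ W + b)  # ReLU hidden layer (assumed activation)
    return h.mean(axis=0)               # pool over time (mean pooling is assumed)

def rbf_kernel(A, B, gamma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq_dist = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))

def kernel_elm_fit(X, labels, n_classes, C=100.0, gamma=0.1):
    """Closed-form kernel ELM: solve (K + I/C) beta = T for one-hot targets T."""
    targets = np.eye(n_classes)[labels]
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + np.eye(len(X)) / C, targets)

def kernel_elm_predict(X_train, beta, X_new, gamma=0.1):
    """Score new vectors against the training vectors and pick the best class."""
    scores = rbf_kernel(X_new, X_train, gamma) @ beta
    return scores.argmax(axis=1)

# Toy usage with synthetic data (no real speech features are involved).
rng = np.random.default_rng(0)
dims = [13, 32, 16]  # input dim, hidden dim, last hidden dim (all assumed)
weights = [rng.normal(scale=0.3, size=(i, o)) for i, o in zip(dims[:-1], dims[1:])]
biases = [np.zeros(o) for o in dims[1:]]

utterances = [rng.normal(size=(T, 13)) for T in (50, 80, 120, 65)]  # varying lengths
labels = np.array([0, 1, 0, 1])

X = np.stack([encode_utterance(u, weights, biases) for u in utterances])
beta = kernel_elm_fit(X, labels, n_classes=2)
print(X.shape)  # (4, 16): one 16-dimensional vector per utterance
print(kernel_elm_predict(X, beta, X))
```

Because the pooling step averages over the time axis, utterances of 50 and 120 frames both end up as vectors of the same length, which is what allows a standard utterance-level classifier to be applied afterwards.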

Bibliographic details

Source
IEEE International Conference on Acoustics, Speech and Signal Processing | 2017 | pp. 5150-5154 | 5 pages
Authors
Zhong-Qiu Wang; Ivan Tashev;
Original format: PDF
Keywords
Emotion recognition; Speech recognition; Speech; Kernel; Training; Detectors; Feature extraction;

Similar literature

1. Learning Deep Binaural Representations With Deep Convolutional Neural Networks for Spontaneous Speech Emotion Recognition [J]. Zhang Shiqing, Chen Aihua, Guo Wenping. Quality Control, Transactions. 2020
2. Recognition of speech emotion using custom 2D-convolution neural network deep learning algorithm [J]. Zvarevashe Kudakwashe, Olugbara Oludayo O. Intelligent Data Analysis. 2020, No. 5
3. Multitalker Speech Separation With Utterance-Level Permutation Invariant Training of Deep Recurrent Neural Networks [J]. Morten Kolbæk, Dong Yu, Zheng-Hua Tan. IEEE/ACM Transactions on Audio, Speech, and Language Processing. 2017, No. 10
4. Learning utterance-level representations for speech emotion and age/gender recognition using deep neural networks [C]. Zhong-Qiu Wang, Ivan Tashev. IEEE International Conference on Acoustics, Speech and Signal Processing. 2017
5. Multi-task learning deep neural networks for automatic speech recognition [D]. Chen, Dongpeng. 2015
6. Pre-trained Deep Convolution Neural Network Model With Attention for Speech Emotion Recognition [O]. Hua Zhang, Ruoyun Gou, Jili Shang. 2021
7. Multitalker Speech Separation With Utterance-Level Permutation Invariant Training of Deep Recurrent Neural Networks [O]. Kolbæk, Morten, Yu, Dong, Tan, Zheng-Hua. 2017
