IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Speech Emotion Recognition Using Deep Neural Network Considering Verbal and Nonverbal Speech Sounds

Abstract

Speech emotion recognition is becoming increasingly important for many applications. In real-life communication, nonverbal sounds within an utterance also play an important role in how people recognize emotion. However, few existing emotion recognition systems consider nonverbal sounds such as laughter, cries, or other emotional interjections, which occur naturally in daily conversation. In this work, both verbal and nonverbal sounds within an utterance were therefore considered for emotion recognition in real-life conversations. First, an SVM-based verbal/nonverbal sound detector was developed. A Prosodic Phrase (PPh) auto-tagger was then employed to extract the verbal/nonverbal segments. For each segment, emotion and sound features were extracted with convolutional neural networks (CNNs) and concatenated to form a CNN-based generic feature vector. Finally, the sequence of CNN-based feature vectors for an entire dialog turn was fed to an attentive long short-term memory (LSTM)-based sequence-to-sequence model, which outputs an emotion-label sequence as the recognition result. Experimental results on the recognition of seven emotional states in NNIME (the NTHU-NTUA Chinese interactive multimodal emotion corpus) showed that the proposed method achieved a detection accuracy of 52.00%, outperforming traditional methods.
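The segment-level feature construction described in the abstract can be sketched at the data-flow level. The sketch below stubs the paper's trained CNN extractors with hypothetical placeholder functions (the names, feature dimensions, and 16 kHz segment length are assumptions, not taken from the paper); it only illustrates how per-segment emotion and sound features are concatenated into one generic vector, and how a dialog turn becomes a sequence of such vectors ready for the LSTM stage.

```python
import numpy as np

# Assumed feature sizes (not specified in the abstract).
EMOTION_DIM, SOUND_DIM = 128, 128

def cnn_emotion_features(segment: np.ndarray) -> np.ndarray:
    # Stand-in for the trained emotion-feature CNN: returns a fixed-size vector.
    return np.zeros(EMOTION_DIM)

def cnn_sound_features(segment: np.ndarray) -> np.ndarray:
    # Stand-in for the trained sound-feature CNN.
    return np.zeros(SOUND_DIM)

def segment_feature(segment: np.ndarray) -> np.ndarray:
    # Concatenate the two CNN outputs into one generic feature vector,
    # as the paper does for each verbal/nonverbal segment.
    return np.concatenate([cnn_emotion_features(segment),
                           cnn_sound_features(segment)])

# A dialog turn = a sequence of verbal/nonverbal segments
# (here: five dummy 1-second waveforms at an assumed 16 kHz rate).
turn = [np.zeros(16000) for _ in range(5)]
features = np.stack([segment_feature(s) for s in turn])
print(features.shape)  # (5, 256): one 256-dim vector per segment
```

The resulting `(num_segments, 256)` array is the kind of sequence an attentive LSTM sequence-to-sequence model would consume, emitting one emotion label per segment.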