Biomedical Signal Processing and Control

Speech emotion recognition with deep convolutional neural networks



Abstract

Speech emotion recognition (or classification) is one of the most challenging topics in data science. In this work, we introduce a new architecture that extracts mel-frequency cepstral coefficients, a chromagram, a mel-scale spectrogram, a Tonnetz representation, and spectral contrast features from sound files and uses them as inputs to a one-dimensional convolutional neural network for emotion identification, using samples from the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS), Berlin (EMO-DB), and Interactive Emotional Dyadic Motion Capture (IEMOCAP) datasets. We refine our initial model incrementally to improve classification accuracy. Unlike some previous approaches, all of the proposed models work directly with raw sound data, without conversion to visual representations. Based on the experimental results, our best-performing model outperforms existing frameworks on RAVDESS and IEMOCAP, setting a new state of the art. On the EMO-DB dataset it outperforms all previous works except one, but compares favorably with that one in terms of generality, simplicity, and applicability. Specifically, in speaker-independent audio classification tasks, the proposed framework obtains 71.61% on RAVDESS with 8 classes, 86.1% on EMO-DB with 535 samples in 7 classes, 95.71% on EMO-DB with 520 samples in 7 classes, and 64.3% on IEMOCAP with 4 classes. (C) 2020 Elsevier Ltd. All rights reserved.
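The abstract describes feeding five acoustic feature sets (MFCCs, chromagram, mel spectrogram, Tonnetz, spectral contrast) into a 1D CNN. A common way to combine such features, as in many speech-emotion-recognition pipelines, is to mean-pool each feature matrix over time and concatenate the results into one fixed-length vector. The sketch below illustrates only this pooling step with synthetic stand-ins for the feature matrices; the matrix sizes (40 MFCCs, 12 chroma bins, 128 mel bands, 7 contrast bands, 6 Tonnetz dimensions) are typical defaults of audio libraries such as librosa, not values stated in the abstract, and `pool_features` is a hypothetical helper name.

```python
import numpy as np

def pool_features(feature_mats):
    """Mean-pool each (n_features, n_frames) matrix over the time axis
    and concatenate the per-feature means into one fixed-length vector."""
    return np.concatenate([m.mean(axis=1) for m in feature_mats])

# Toy stand-ins for the five feature matrices a library like librosa
# would return (each row is a feature, each column a time frame).
rng = np.random.default_rng(0)
sizes = (40, 12, 128, 7, 6)  # MFCC, chroma, mel, contrast, Tonnetz (assumed defaults)
mats = [rng.standard_normal((n, 100)) for n in sizes]

vec = pool_features(mats)
print(vec.shape)  # a single 193-dimensional input vector for the 1D CNN
```

Because pooling removes the time dimension, every utterance yields a vector of the same length regardless of its duration, which is what lets a fixed-input 1D CNN consume raw-audio-derived features directly.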


