2018 First Asian Conference on Affective Computing and Intelligent Interaction

Sequence-to-sequence Modelling for Categorical Speech Emotion Recognition Using Recurrent Neural Network


Abstract

To model categorical speech emotion recognition in a sequential framework, the first challenge is how to turn the single categorical label of each utterance into a label sequence. To address this, we hypothesise that an utterance consists of alternating emotional and non-emotional segments, where the non-emotional segments correspond to silent regions, short pauses, transitions between phonemes, fricative phonemes, etc. Under this hypothesis, we propose to treat an utterance's label sequence as a chain of two kinds of states: emotional states denoting emotional frames and Nulls denoting non-emotional frames. We then exploit a connectionist temporal classification based recurrent neural network (CTC-RNN) to automatically label and align an utterance's emotional segments with emotional labels and its non-emotional segments with non-emotional labels. Experimental results on the IEMOCAP corpus demonstrate the effectiveness of the proposed method compared with state-of-the-art emotion recognition algorithms.
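
The approach described in the abstract can be sketched in a few lines of PyTorch, where the CTC blank symbol plays the role of the Null state and the utterance-level emotion label serves as a length-one target sequence. This is only an illustrative approximation of the idea, not the authors' implementation: the feature dimensionality, the bidirectional LSTM architecture, the four-class label set, and the single-label target are all assumptions.

# Minimal sketch (assumed setup, not the paper's code): an RNN emits per-frame
# scores over {Null, emotion_1, ..., emotion_4}; CTC aligns the utterance-level
# label with emotional frames while absorbing non-emotional frames into the
# blank ("Null") symbol.
import torch
import torch.nn as nn

NUM_EMOTIONS = 4   # e.g. angry, happy, sad, neutral (assumed label set)
FEAT_DIM = 40      # per-frame acoustic features, e.g. log-Mel bands (assumed)

class CTCEmotionRNN(nn.Module):
    def __init__(self, feat_dim=FEAT_DIM, hidden=128, num_emotions=NUM_EMOTIONS):
        super().__init__()
        self.rnn = nn.LSTM(feat_dim, hidden, num_layers=2,
                           batch_first=True, bidirectional=True)
        # output classes = emotions + 1 blank; blank index 0 acts as the "Null" state
        self.proj = nn.Linear(2 * hidden, num_emotions + 1)

    def forward(self, x):                    # x: (batch, frames, feat_dim)
        h, _ = self.rnn(x)
        return self.proj(h).log_softmax(-1)  # (batch, frames, classes)

model = CTCEmotionRNN()
ctc = nn.CTCLoss(blank=0, zero_infinity=True)

# Toy batch: 2 utterances of 300 frames, one categorical label per utterance.
feats = torch.randn(2, 300, FEAT_DIM)
labels = torch.tensor([[2], [4]])            # emotion ids in 1..NUM_EMOTIONS
input_lens = torch.tensor([300, 300])
target_lens = torch.tensor([1, 1])

log_probs = model(feats).transpose(0, 1)     # CTCLoss expects (frames, batch, classes)
loss = ctc(log_probs, labels, input_lens, target_lens)
loss.backward()

At inference time, greedy decoding of the per-frame posteriors followed by removal of blanks yields the predicted emotion label, while the frame-level alignment recovers which segments of the utterance were treated as emotional.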
