IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences

A Multitask Learning Approach Based on Cascaded Attention Network and Self-Adaption Loss for Speech Emotion Recognition



Abstract

Speech emotion recognition (SER) has long been a complex and difficult task due to emotional complexity. In this paper, we propose a multitask deep learning approach based on a cascaded attention network and a self-adaption loss for SER. First, non-personalized features are extracted to represent the process of emotion change while reducing the influence of external variables. Second, to highlight salient speech emotion features, a cascaded attention network is proposed, in which spatial-temporal attention effectively locates the regions of speech that express emotion, while self-attention reduces the dependence on external information. Finally, the influence of differences in gender and in human perception of external information is alleviated by a multitask learning strategy, where a self-adaption loss is introduced to determine the weights of the different tasks dynamically. Experimental results on the IEMOCAP dataset demonstrate that our method gains absolute improvements of 1.97 and 0.91 over state-of-the-art strategies in terms of weighted accuracy (WA) and unweighted accuracy (UA), respectively.
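
To make the cascaded attention idea concrete, below is a minimal PyTorch sketch of one plausible reading of the abstract: spatial attention and frame-level temporal attention applied to convolutional feature maps, followed by self-attention over the weighted frame sequence. The module name CascadedAttention, the layer sizes, and the pooling choices are illustrative assumptions, not the paper's exact architecture.

```python
# Hypothetical sketch of a cascaded attention block: spatial and temporal
# attention over spectrogram-like feature maps, then self-attention over time.
# Layer sizes and ordering are illustrative assumptions, not the paper's design.
import torch
import torch.nn as nn

class CascadedAttention(nn.Module):
    def __init__(self, channels: int = 64, d_model: int = 128, n_heads: int = 4):
        super().__init__()
        # Spatial attention: 1x1 conv producing a per-position saliency map.
        self.spatial = nn.Sequential(nn.Conv2d(channels, 1, kernel_size=1), nn.Sigmoid())
        # Temporal attention: scores each frame after pooling over frequency.
        self.temporal = nn.Linear(channels, 1)
        self.proj = nn.Linear(channels, d_model)
        # Self-attention over the attention-weighted frame sequence.
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x):                       # x: (batch, channels, freq, time)
        x = x * self.spatial(x)                 # emphasize emotion-salient regions
        frames = x.mean(dim=2).transpose(1, 2)  # (batch, time, channels)
        w = torch.softmax(self.temporal(frames), dim=1)
        frames = frames * w                     # weight frames by temporal attention
        h = self.proj(frames)
        out, _ = self.self_attn(h, h, h)        # (batch, time, d_model)
        return out

feats = torch.randn(8, 64, 40, 100)             # toy batch of conv feature maps
print(CascadedAttention()(feats).shape)         # torch.Size([8, 100, 128])
```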
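The abstract's self-adaption loss dynamically weights the emotion-recognition task against an auxiliary task such as gender classification. As a hedged illustration only, the sketch below uses homoscedastic-uncertainty-style learnable task weights, a common dynamic-weighting scheme; the paper's actual formulation may differ, and the name SelfAdaptionLoss and the two task heads are hypothetical.

```python
# Hypothetical sketch of dynamic task weighting for multitask SER
# (emotion + gender classification); the paper's actual loss may differ.
import torch
import torch.nn as nn

class SelfAdaptionLoss(nn.Module):
    """Combines per-task losses with learnable weights that adapt during training."""
    def __init__(self, num_tasks: int = 2):
        super().__init__()
        # One log-variance parameter per task, optimized jointly with the network.
        self.log_vars = nn.Parameter(torch.zeros(num_tasks))

    def forward(self, task_losses):
        total = 0.0
        for i, loss in enumerate(task_losses):
            precision = torch.exp(-self.log_vars[i])
            # Down-weight noisier tasks; the log-variance term acts as a regularizer.
            total = total + precision * loss + self.log_vars[i]
        return total

# Usage: emotion and gender heads would share a cascaded-attention encoder (not shown).
criterion = nn.CrossEntropyLoss()
adaption = SelfAdaptionLoss(num_tasks=2)

emotion_logits = torch.randn(8, 4, requires_grad=True)   # e.g. 4 emotion classes
gender_logits = torch.randn(8, 2, requires_grad=True)    # 2 gender classes
emotion_labels = torch.randint(0, 4, (8,))
gender_labels = torch.randint(0, 2, (8,))

loss = adaption([criterion(emotion_logits, emotion_labels),
                 criterion(gender_logits, gender_labels)])
loss.backward()
```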
