IEICE Transactions on fundamentals of electronics, communications & computer sciences

A Multitask Learning Approach Based on Cascaded Attention Network And Self-Adaption Loss for Speech Emotion Recognition


Speech emotion recognition (SER) has long been a complex and difficult task because of the complexity of emotion. In this paper, we propose a multitask deep learning approach based on a cascaded attention network and a self-adaption loss for SER. First, non-personalized features are extracted to represent the process of emotion change while reducing the influence of external variables. Second, to highlight salient speech emotion features, a cascaded attention network is proposed, in which spatial-temporal attention effectively locates the regions of speech that express emotion, while self-attention reduces the dependence on external information. Finally, the influence of differences in gender and in human perception of external information is alleviated by a multitask learning strategy, in which a self-adaption loss is introduced to determine the weights of the different tasks dynamically. Experimental results on the IEMOCAP dataset demonstrate that our method gains absolute improvements of 1.97 and 0.91 over state-of-the-art strategies in terms of weighted accuracy (WA) and unweighted accuracy (UA), respectively.
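
The abstract does not give the form of the self-adaption loss, so the following is only an illustrative sketch of what dynamic task weighting can look like. It swaps in a well-known alternative, uncertainty-based weighting, in which each task i has a learnable scalar s_i and the combined loss is the sum of exp(-s_i) * L_i + s_i. The function names, the two example tasks, and the loss values are all hypothetical and are not taken from the paper.

```python
import math

def combined_loss(task_losses, log_vars):
    """Combined multitask loss: sum over tasks of exp(-s_i) * L_i + s_i.

    NOTE: this is uncertainty-based weighting, used here as a stand-in;
    it is an assumption, not the paper's self-adaption loss.
    """
    return sum(math.exp(-s) * loss + s for loss, s in zip(task_losses, log_vars))

def task_weights(log_vars):
    """Effective weight of each task, exp(-s_i)."""
    return [math.exp(-s) for s in log_vars]

def update_log_vars(task_losses, log_vars, lr=0.1):
    """One gradient-descent step on each s_i.

    d/ds [exp(-s) * L + s] = 1 - exp(-s) * L
    """
    return [s - lr * (1.0 - math.exp(-s) * loss)
            for loss, s in zip(task_losses, log_vars)]

# Hypothetical, fixed per-task losses (e.g. emotion and gender classification).
losses = [1.2, 0.3]
log_vars = [0.0, 0.0]  # both tasks start with weight exp(0) = 1

for _ in range(200):
    log_vars = update_log_vars(losses, log_vars)

weights = task_weights(log_vars)
```

With the losses held fixed, each s_i settles at ln(L_i), so the weight of a task approaches 1 / L_i and the task with the smaller loss ends up with the larger weight. In real training the task losses change at every step and the s_i would be optimized jointly with the network parameters.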



