Information Processing & Management

Exploring temporal representations by leveraging attention-based bidirectional LSTM-RNNs for multi-modal emotion recognition


Abstract

Emotion recognition contributes to automatically perceiving the user's emotional response to multimedia content through implicit annotation, which in turn benefits the establishment of effective user-centric services. Physiology-based approaches have increasingly attracted researchers' attention because of their objectivity in representing emotion. Conventional approaches to emotion recognition have mostly focused on extracting various kinds of hand-crafted features. However, hand-crafted features always require domain knowledge for the specific task, and designing proper features can be time-consuming. Therefore, exploring the most effective physiology-based temporal feature representation for emotion recognition has become the core problem of most works. In this paper, we propose a multimodal attention-based BLSTM network framework for efficient emotion recognition. Firstly, the raw physiological signals from each channel are transformed into spectrogram images to capture their time and frequency information. Secondly, attention-based Bidirectional Long Short-Term Memory Recurrent Neural Networks (LSTM-RNNs) are utilized to automatically learn the best temporal features. The learned deep features are then fed into a deep neural network (DNN) to predict the probability of emotional output for each channel. Finally, a decision-level fusion strategy is utilized to predict the final emotion. The experimental results on the AMIGOS dataset show that our method outperforms other state-of-the-art methods.
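The per-channel pipeline described above (attention over BiLSTM hidden states, a per-channel classifier, then decision-level fusion) can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the hidden states are random stand-ins for BiLSTM outputs, the dot-product attention query `w`, the output weights `W_out`, and all dimensions are illustrative assumptions, and fusion is shown as a simple average of per-channel class probabilities.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_pool(h, w):
    """Attention over time: h is (T, D) hidden states, w is a (D,) learned query.
    Returns the attention-weighted summary vector of shape (D,)."""
    scores = h @ w                   # (T,) unnormalized relevance per timestep
    alpha = softmax(scores)          # attention weights, sum to 1 over time
    return alpha @ h                 # weighted sum of hidden states

def decision_fusion(channel_probs):
    """Decision-level fusion: average the per-channel class probabilities."""
    return np.mean(channel_probs, axis=0)

# Toy setup: 3 physiological channels, 5 timesteps, hidden dim 4, 2 emotion classes.
rng = np.random.default_rng(0)
T, D, C = 5, 4, 2
w = rng.normal(size=D)               # shared attention query (assumed, would be learned)
W_out = rng.normal(size=(D, C))      # per-channel classifier weights (assumed)

probs = []
for _ in range(3):                   # one BiLSTM per channel; hidden states faked here
    h = rng.normal(size=(T, D))      # stand-in for BiLSTM outputs over the spectrogram
    z = attention_pool(h, w)         # temporal summary via attention
    probs.append(softmax(z @ W_out)) # per-channel emotion probabilities

fused = decision_fusion(np.stack(probs))  # final fused prediction over C classes
```

In the full framework each `h` would come from a BiLSTM run over the channel's spectrogram, and `w`, `W_out` would be trained end-to-end; the sketch only shows how attention pooling and decision-level fusion combine per-channel evidence into one probability vector.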
