
Video-Audio Emotion Recognition Based on Feature Fusion Deep Learning Method


Abstract

In this paper, we propose a video-audio emotion recognition system designed to improve the classification success rate. Features from audio frames are extracted as Mel-frequency cepstral coefficients (MFCC), while features from video frames are extracted with VGG16 using weights pre-trained on the ImageNet dataset [17]. Recurrent neural networks (RNN) are then applied to each modality to process the sequence information. The outputs of both RNNs are fused in a concatenation layer, and the final classification result is obtained by a softmax layer. The proposed system achieves 90% accuracy on the RAVDESS dataset for eight emotion classes.
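The following is a minimal sketch, in Keras/TensorFlow, of the fusion architecture the abstract describes: an MFCC-sequence RNN branch, a VGG16-plus-RNN video branch, concatenation of the two RNN outputs, and a softmax classifier over eight classes. The input shapes, LSTM cells, layer sizes, and optimizer are illustrative assumptions, not the authors' exact configuration.

```python
# Hedged sketch of the video-audio feature-fusion model from the abstract.
# Shapes and hyperparameters below are assumptions for illustration only.
import tensorflow as tf
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import VGG16

NUM_CLASSES = 8                    # eight emotion classes in RAVDESS
AUDIO_STEPS, N_MFCC = 100, 40      # assumed: 100 audio frames x 40 MFCCs
VIDEO_STEPS, H, W = 30, 224, 224   # assumed: 30 video frames at VGG16 input size

# Audio branch: MFCC sequence -> RNN
audio_in = layers.Input(shape=(AUDIO_STEPS, N_MFCC), name="mfcc_sequence")
audio_feat = layers.LSTM(128)(audio_in)

# Video branch: per-frame VGG16 features (ImageNet weights, frozen) -> RNN
vgg = VGG16(weights="imagenet", include_top=False, pooling="avg")
vgg.trainable = False
video_in = layers.Input(shape=(VIDEO_STEPS, H, W, 3), name="video_frames")
frame_feat = layers.TimeDistributed(vgg)(video_in)   # (batch, steps, 512)
video_feat = layers.LSTM(128)(frame_feat)

# Feature fusion: concatenate both RNN outputs, then softmax classification
fused = layers.Concatenate()([audio_feat, video_feat])
out = layers.Dense(NUM_CLASSES, activation="softmax")(fused)

model = Model(inputs=[audio_in, video_in], outputs=out)
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```

Freezing the VGG16 backbone and wrapping it in TimeDistributed keeps the per-frame feature extractor fixed while only the two LSTMs and the classifier are trained; whether the original system fine-tunes VGG16 is not stated in the abstract.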
