Conference: IEEE International Conference on Acoustics, Speech and Signal Processing

AN ENSEMBLE FRAMEWORK OF VOICE-BASED EMOTION RECOGNITION SYSTEM FOR FILMS AND TV PROGRAMS

Abstract

Adding voice-based emotion recognition to artificial intelligence (AI) products can improve the user experience. Most previous research has focused only on speech collected under well-controlled conditions. Conventional approaches may fail when background noise or non-speech fillers are present. In this paper, we propose an ensemble framework that combines several types of audio features. The framework incorporates gender and speaker information through multi-task learning, so that it can capture as much emotional information as possible. The framework is evaluated on the Multimodal Emotion Challenge (MEC) 2017 corpus, which is close to real-world conditions. The proposed framework outperforms the best baseline system by 29.5% (relative improvement).
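The abstract does not say how the ensemble combines its component models or how the relative improvement is computed. The sketch below is an illustration only. It assumes late fusion by weighted averaging of class posteriors (soft voting) and the usual definition of relative improvement, (proposed - baseline) / baseline. The names `soft_vote` and `relative_improvement` are hypothetical and do not come from the paper.

```python
import numpy as np


def soft_vote(posteriors, weights=None):
    """Fuse per-model class posteriors by weighted averaging (assumed fusion rule).

    posteriors: list of arrays, each shaped (n_utterances, n_classes),
                one array per feature-specific model.
    weights:    optional per-model weights; defaults to equal weighting.
    Returns (predicted_class_indices, fused_posteriors).
    """
    stacked = np.stack(posteriors, axis=0)  # (n_models, n_utterances, n_classes)
    if weights is None:
        weights = np.ones(stacked.shape[0])
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalize so the weights sum to 1
    fused = np.tensordot(weights, stacked, axes=1)  # (n_utterances, n_classes)
    return fused.argmax(axis=1), fused


def relative_improvement(baseline, proposed):
    """Relative gain of `proposed` over `baseline`, as a fraction of the baseline."""
    return (proposed - baseline) / baseline
```

With two hypothetical models, `soft_vote([p_a, p_b])` returns one emotion label index per utterance. Multiplying the output of `relative_improvement` by 100 expresses the gain as a percentage, the form in which the abstract reports its result.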
