Conference: 2017 9th IEEE-GCC Conference and Exhibition

A Two-Stage Hierarchical Multilingual Emotion Recognition System Using Hidden Markov Models and Neural Networks

Abstract

Speech emotion recognition continues to attract considerable research, especially for mixed-language speech. Here, we show that emotion is culture/language dependent. In this paper, we propose a two-stage emotion recognition system that first identifies the language and then uses a dedicated language-dependent recognition system to identify the type of emotion. The system accurately recognizes the four main types of emotion, namely neutral, happy, angry, and sad. These emotion states are widely used in practical setups. To keep the computational complexity low, we identify the language using a feature vector consisting of energies from a basic wavelet decomposition of the speech signal. A Hidden Markov Model is then used to track the changes of this energy feature vector to identify the language, achieving recognition accuracy close to 100%. Once the language is identified, a set of traditional speech processing features, including pitch, formants, MFCCs, etc., is used with a basic Neural Network architecture to identify the type of emotion. The results show that identifying the language first can substantially improve the overall accuracy in identifying emotions. The overall accuracy achieved with the proposed hierarchical system was above 93%. The work shows the strong correlation between language/culture and type of emotion, and can be further extended to other scenarios such as gender-based recognition, facial-expression-based recognition, age-based recognition, etc.
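
The two-stage flow described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' implementation: the abstract does not name the wavelet, the decomposition depth, or the framing, so the sketch uses a Haar wavelet, non-overlapping frames, and hypothetical names (`haar_subband_energies`, `energy_sequence`, `recognize`). The language scorers stand in for per-language Hidden Markov Model likelihoods, and the emotion classifiers stand in for the per-language neural networks.

```python
import math


def haar_subband_energies(frame, levels):
    """Return [E_detail_1, ..., E_detail_levels, E_approx] for one frame.

    Assumption: a Haar wavelet is used; the abstract only says "basic wavelet".
    """
    if len(frame) % (2 ** levels) != 0:
        raise ValueError("frame length must be divisible by 2**levels")
    approx = list(frame)
    energies = []
    for _ in range(levels):
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Orthonormal Haar step: scaled differences (detail) and sums (approximation).
        detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        energies.append(sum(d * d for d in detail))
    energies.append(sum(a * a for a in approx))
    return energies


def energy_sequence(signal, frame_len, levels):
    """Split the signal into non-overlapping frames and featurize each one."""
    n_frames = len(signal) // frame_len
    return [
        haar_subband_energies(signal[i * frame_len:(i + 1) * frame_len], levels)
        for i in range(n_frames)
    ]


def recognize(signal, language_scorers, emotion_classifiers, frame_len=8, levels=3):
    """Stage 1: pick the best-scoring language. Stage 2: run that language's emotion model.

    language_scorers: hypothetical dict of name -> callable(feature sequence) -> score,
        standing in for a per-language HMM log-likelihood.
    emotion_classifiers: hypothetical dict of name -> callable(signal) -> emotion label,
        standing in for a per-language neural network.
    """
    features = energy_sequence(signal, frame_len, levels)
    language = max(language_scorers, key=lambda name: language_scorers[name](features))
    emotion = emotion_classifiers[language](signal)
    return language, emotion
```

Because the Haar step is orthonormal, the subband energies of a frame add up to the frame's total energy, which gives a quick sanity check on the feature extraction. In a real system the scorers and classifiers would be trained models, and the emotion stage would consume pitch, formant, and MFCC features rather than the raw signal.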
