Journal on Multimodal User Interfaces

Multimodal emotion recognition in speech-based interaction using facial expression, body gesture and acoustic analysis


Abstract

In this paper a study on multimodal automatic emotion recognition during a speech-based interaction is presented. A database was constructed consisting of people pronouncing a sentence in a scenario where they interacted with an agent using speech. Ten people pronounced a sentence corresponding to a command while making eight different emotional expressions. Gender was equally represented, with speakers of several different native languages including French, German, Greek and Italian. Facial expression, gesture and acoustic analysis of speech were used to extract features relevant to emotion. For the automatic classification of unimodal, bimodal and multimodal data, a system based on a Bayesian classifier was used. After performing an automatic classification of each modality, the different modalities were combined using a multimodal approach. Fusion of the modalities at the feature level (before running the classifier) and at the results level (combining the outputs of the classifiers for each modality) was compared. Fusing the multimodal data resulted in a large increase in recognition rates compared to the unimodal systems: the multimodal approach increased the recognition rate by more than 10% when compared to the most successful unimodal system. Bimodal emotion recognition based on all combinations of the modalities (i.e., 'face-gesture', 'face-speech' and 'gesture-speech') was also investigated. The results show that the best pairing is 'gesture-speech'. Using all three modalities resulted in a 3.3% classification improvement over the best bimodal results.
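The abstract contrasts feature-level fusion (concatenating features before the classifier) with results-level fusion (combining the per-modality classifier outputs) around a Bayesian classifier. The sketch below is a minimal illustration of that distinction only, not the authors' implementation: it uses scikit-learn's GaussianNB on synthetic stand-in features for the face, gesture and speech modalities, and all array names, dimensions and the product-of-posteriors combination rule are assumptions.

```python
# Minimal sketch: feature-level vs. decision-level fusion with a Bayesian
# classifier (GaussianNB). Synthetic data stands in for the face, gesture
# and speech features; nothing here reproduces the paper's feature sets.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n_classes, per_class = 8, 10                      # 8 emotional expressions
y = np.repeat(np.arange(n_classes), per_class)    # balanced emotion labels
n_samples = y.size

# Hypothetical per-modality feature matrices (dimensions are arbitrary);
# a class-dependent shift makes the toy problem learnable.
X_face    = rng.normal(size=(n_samples, 10)) + y[:, None] * 0.5
X_gesture = rng.normal(size=(n_samples, 6))  + y[:, None] * 0.5
X_speech  = rng.normal(size=(n_samples, 12)) + y[:, None] * 0.5

idx_train, idx_test = train_test_split(np.arange(n_samples), test_size=0.25,
                                       random_state=0, stratify=y)

# --- Feature-level fusion: concatenate features, train one classifier. ---
X_all = np.hstack([X_face, X_gesture, X_speech])
clf_feat = GaussianNB().fit(X_all[idx_train], y[idx_train])
acc_feat = accuracy_score(y[idx_test], clf_feat.predict(X_all[idx_test]))

# --- Results-level fusion: one classifier per modality, combine posteriors. ---
posteriors = []
for X_mod in (X_face, X_gesture, X_speech):
    clf = GaussianNB().fit(X_mod[idx_train], y[idx_train])
    posteriors.append(clf.predict_proba(X_mod[idx_test]))
fused = np.prod(posteriors, axis=0)               # product-of-posteriors rule
acc_dec = accuracy_score(y[idx_test], fused.argmax(axis=1))

print(f"feature-level fusion accuracy:  {acc_feat:.2f}")
print(f"results-level fusion accuracy:  {acc_dec:.2f}")
```

On this toy data the two schemes behave similarly; the paper's point is that on real multimodal recordings the fused systems clearly outperform each unimodal classifier.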
