8th World Multi-Conference on Systemics, Cybernetics and Informatics (SCI 2004), vol. 12: Applications of Cybernetics and Informatics in Optics, Signals, Science and Engineering

Intelligent Ear for Emotion Recognition: Multi-Modal Emotion Recognition via Acoustic Features, Semantic Contents and Facial Images



Abstract

In this paper, based on the observation that humans detect emotions during a conversation from both speech and facial expressions, an emotion recognition system is proposed that detects emotion from acoustic features, semantic contents, and facial expressions during conversation. In the analysis of speech signals, thirty-three acoustic features are extracted from the speech input. After Principal Component Analysis (PCA), 14 principal components are selected for discriminative representation. In this representation, each principal component is a linear combination of the 33 original acoustic features, and together the components form a feature subspace. Support Vector Machines (SVMs) are adopted to classify the emotional states. In the facial emotion recognition module, the facial image captured from a CCD camera is used for facial feature extraction, and an SVM model is applied for emotion recognition. Finally, in the text analysis module, all emotional keywords and emotion modification words are manually defined, and their emotion intensity levels are estimated from a collected emotion corpus. The final emotional state is determined from the emotion outputs of these three modules. Experimental results show that the emotion recognition accuracy of the integrated system is better than that of each of the three individual approaches.
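The acoustic branch described above (33 features reduced to 14 principal components, then SVM classification) can be sketched as follows. This is a minimal illustration assuming scikit-learn as the toolkit; the feature values and emotion labels here are synthetic placeholders, not the authors' corpus.

```python
# Sketch of the acoustic branch: 33 features -> PCA (14 components) -> SVM.
# Data and labels are synthetic stand-ins; scikit-learn is an assumed toolkit.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_samples, n_features, n_emotions = 200, 33, 4

# Placeholder acoustic feature vectors and emotion labels.
X = rng.normal(size=(n_samples, n_features))
y = rng.integers(0, n_emotions, size=n_samples)

# Each of the 14 principal components is a linear combination of the
# 33 original acoustic features, forming the reduced feature subspace.
model = make_pipeline(StandardScaler(), PCA(n_components=14), SVC(kernel="rbf"))
model.fit(X, y)

pca = model.named_steps["pca"]
print(pca.components_.shape)  # (14, 33): each component mixes all 33 features
```

The PCA loading matrix makes the abstract's point concrete: each row is one principal component expressed as weights over all 33 original features.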

