Construction of Spontaneous Emotion Corpus from Indonesian TV Talk Shows and Its Application on Multimodal Emotion Recognition

Nurul LUBIS; Dessi LESTARI; Sakriani SAKTI; Ayu PURWARIANTI; Satoshi NAKAMURA


Abstract

As interaction between humans and computers moves toward the most natural form possible, incorporating emotion becomes increasingly urgent. This paper describes a step toward extending emotion recognition research to Indonesian. The field continues to develop, yet work on Indonesian is still lacking. In particular, this paper highlights two contributions: (1) the construction of the first emotional audio-visual database in Indonesian, and (2) the first multimodal emotion recognizer in Indonesian, built from that corpus. In constructing the corpus, we aim for natural emotions that correspond to real-life occurrences. However, collecting emotional corpora is notably labor-intensive and expensive. To reduce the cost, we collect the emotional data from recordings of television programs, eliminating the need for an elaborate recording setup and experienced participants. In particular, we choose television talk shows because their natural conversational content yields spontaneous emotion occurrences. To cover a broad range of emotions, we collected three episodes in different genres: politics, humanity, and entertainment. In this paper, we report points of analysis of the data and annotations. The acquisition of the emotion corpus serves as a foundation for further research on emotion. Subsequently, in the experiment, we employ the support vector machine (SVM) algorithm to model the emotions in the collected data. We perform multimodal emotion recognition using the predictions of three modalities: acoustic, semantic, and visual. Compared with the unimodal result, the multimodal feature combination attains identical accuracy for arousal classification at 92.6% and a significant improvement for valence classification at 93.8%. We hope to continue this work and move toward a finer-grained, more precise quantification of emotion.
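The abstract says the recognizer uses the predictions of three modalities but does not spell out how they are combined. As a rough illustration only, the sketch below fuses signed per-modality scores (such as the outputs of three separately trained classifiers) by a weighted average and thresholds the result at zero. The function names, the weights, the "high"/"low" labels, and the example scores are all hypothetical and are not taken from the paper; the averaging rule is a stand-in, not the authors' method.

```python
from typing import Dict, Optional

# The three modalities named in the abstract.
MODALITIES = ("acoustic", "semantic", "visual")


def fuse_scores(scores: Dict[str, float],
                weights: Optional[Dict[str, float]] = None) -> float:
    """Weighted mean of signed per-modality scores.

    Positive scores vote for the positive class, negative scores for the
    negative class. Modalities missing from `scores` are skipped.
    Assumption: simple score averaging, chosen only for illustration.
    """
    if weights is None:
        weights = {m: 1.0 for m in MODALITIES}
    present = [m for m in MODALITIES if m in scores]
    if not present:
        raise ValueError("need at least one modality score")
    total_weight = sum(weights[m] for m in present)
    if total_weight == 0:
        raise ValueError("weights of the present modalities sum to zero")
    return sum(weights[m] * scores[m] for m in present) / total_weight


def classify(scores: Dict[str, float],
             weights: Optional[Dict[str, float]] = None,
             positive: str = "high",
             negative: str = "low") -> str:
    """Threshold the fused score at zero to get a binary label."""
    return positive if fuse_scores(scores, weights) >= 0.0 else negative


# Made-up scores for a single utterance (not data from the corpus).
utterance = {"acoustic": 0.8, "semantic": -0.3, "visual": 0.4}
print(classify(utterance))  # fused score is about 0.3, so this prints "high"
```

The same function could be called once for arousal and once for valence, each with its own set of unimodal scores. A learned combiner trained on the unimodal predictions would be a natural alternative to the fixed weights used here.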


Bibliographic details

Source: IEICE Transactions on Information and Systems, 2018, No. 8, 9 pages
Authors: Nurul LUBIS; Dessi LESTARI; Sakriani SAKTI; Ayu PURWARIANTI; Satoshi NAKAMURA
Original format: PDF
Chinese Library Classification: Radio electronics; telecommunication technology

