Towards Symmetric Multimodality: Fusion and Fission of Speech, Gesture, and Facial Expression

Abstract

We introduce the notion of symmetric multimodality for dialogue systems in which all input modes (e.g., speech, gesture, facial expression) are also available for output, and vice versa. A dialogue system with symmetric multimodality must not only understand and represent the user's multimodal input, but also its own multimodal output. We present the SmartKom system, which provides full symmetric multimodality in a mixed-initiative dialogue system with an embodied conversational agent. SmartKom represents a new generation of multimodal dialogue systems that not only deal with simple modality integration and synchronization, but also cover the full spectrum of dialogue phenomena associated with symmetric multimodality (including crossmodal references, one-anaphora, and backchannelling). We show that SmartKom's plug-and-play architecture supports multiple recognizers for a single modality; e.g., the user's speech signal can be processed by three unimodal recognizers in parallel (speech recognition, emotional prosody, boundary prosody). Finally, we detail SmartKom's three-tiered representation of multimodal discourse, consisting of a domain layer, a discourse layer, and a modality layer.

Bibliographic details

Source: German Conference on Artificial Intelligence | 2003 | 18 pages
Author: Wolfgang Wahlster
Original format: PDF
Chinese Library Classification: TP18-532

Similar literature

1. Multimodal emotion recognition in speech-based interaction using facial expression, body gesture and acoustic analysis [J]. Loic Kessous, Ginevra Castellano, George Caridakis. Journal on Multimodal User Interfaces. 2010, no. 1-2
2. Feature Fusion Algorithm for Multimodal Emotion Recognition from Speech and Facial Expression Signal [J]. Zhiyan Han, Jian Wang. MATEC Web of Conferences. 2016, no. 7
3. On creating multimodal virtual humans-real time speech driven facial gesturing [J]. Goranka Zoric, Robert Forchheimer, Igor S. Pandzic. Multimedia Tools and Applications. 2011, no. 1
4. Towards Symmetric Multimodality: Fusion and Fission of Speech, Gesture, and Facial Expression [C]. Wolfgang Wahlster. German Conference on Artificial Intelligence. 2003
5. The role of facial gestural information in supporting perceptual learning of degraded speech [D]. Wayne, Rachel Victoria. 2011
6. Quantifying the speech-gesture relation with massive multimodal datasets: Informativity in time expressions [O]. Cristóbal Pagán Cánovas, Javier Valenzuela, Daniel Alcaraz Carrión. 2020
7. Towards Symmetric Multimodality: Fusion and Fission of Speech, Gesture, and Facial Expression [O]. Wolfgang Wahlster. 2003
