Conference: German Conference on Artificial Intelligence
Towards Symmetric Multimodality: Fusion and Fission of Speech, Gesture, and Facial Expression

Abstract

We introduce the notion of symmetric multimodality for dialogue systems in which all input modes (e.g. speech, gesture, facial expression) are also available for output, and vice versa. A dialogue system with symmetric multimodality must not only understand and represent the user's multimodal input, but also its own multimodal output. We present the SmartKom system, which provides full symmetric multimodality in a mixed-initiative dialogue system with an embodied conversational agent. SmartKom represents a new generation of multimodal dialogue systems that deal not only with simple modality integration and synchronization, but also cover the full spectrum of dialogue phenomena associated with symmetric multimodality (including crossmodal references, one-anaphora, and backchannelling). We show that SmartKom's plug-and-play architecture supports multiple recognizers for a single modality, e.g. the user's speech signal can be processed by three unimodal recognizers in parallel (speech recognition, emotional prosody, boundary prosody). Finally, we detail SmartKom's three-tiered representation of multimodal discourse, consisting of a domain layer, a discourse layer, and a modality layer.
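The idea of several recognizers sharing one modality can be illustrated with a minimal Python sketch. It is not SmartKom's implementation: the `ModalityHub` class, the `Hypothesis` record, and the three stub recognizers are hypothetical names invented here, and the recognizers return placeholder labels rather than real analyses. The sketch only shows one plausible way to register multiple recognizers for a single modality and run them in parallel on the same signal.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Hypothesis:
    """Hypothetical result record; not taken from the paper."""
    recognizer: str  # name of the recognizer that produced the result
    label: str       # placeholder interpretation of the signal


Recognizer = Callable[[bytes], Hypothesis]


# Stub recognizers: each receives the same raw speech signal.
def speech_recognition(signal: bytes) -> Hypothesis:
    return Hypothesis("speech_recognition", f"words from {len(signal)} bytes")


def emotional_prosody(signal: bytes) -> Hypothesis:
    return Hypothesis("emotional_prosody", "neutral")


def boundary_prosody(signal: bytes) -> Hypothesis:
    return Hypothesis("boundary_prosody", "phrase-final")


class ModalityHub:
    """Assumed plug-in registry: any number of recognizers per modality."""

    def __init__(self) -> None:
        self._recognizers: Dict[str, List[Recognizer]] = {}

    def register(self, modality: str, recognizer: Recognizer) -> None:
        self._recognizers.setdefault(modality, []).append(recognizer)

    def process(self, modality: str, signal: bytes) -> List[Hypothesis]:
        recognizers = self._recognizers.get(modality, [])
        if not recognizers:
            return []
        # Run every registered recognizer on the same signal in parallel;
        # pool.map returns results in registration order.
        with ThreadPoolExecutor(max_workers=len(recognizers)) as pool:
            return list(pool.map(lambda r: r(signal), recognizers))


hub = ModalityHub()
hub.register("speech", speech_recognition)
hub.register("speech", emotional_prosody)
hub.register("speech", boundary_prosody)

results = hub.process("speech", b"\x00" * 16)
for hypothesis in results:
    print(hypothesis.recognizer, "->", hypothesis.label)
```

Adding a further recognizer under this assumed design is a single `register` call, which is the sense of "plug-and-play" the sketch aims to convey.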
