International Conference on Text, Speech and Dialogue

Are You Looking at Me, Are You Talking with Me: Multimodal Classification of the Focus of Attention



Abstract

Automatic dialogue systems are easily confused when they recognize speech that is not directed at the system. Besides noise or other people's conversations, even the user's own utterances can cause difficulties when the user is talking to someone else or to himself ("Off-Talk"). In this paper, the automatic classification of the user's focus of attention is investigated. In the German SmartWeb project, a mobile device is used to access the semantic web. In this scenario, two modalities are available - the speech and the video signal. This makes it possible to classify whether a spoken request is addressed to the system or not: with the camera of the mobile device, the user's gaze direction is detected; in the speech signal, prosodic features are analyzed. Encouraging recognition rates of up to 93% are achieved in the speech-only condition. Further improvement is expected from the fusion of the two information sources.
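The fusion of the two information sources mentioned above could, for instance, take the form of a late-fusion scheme that combines per-modality confidence scores. The following is a minimal sketch of such a scheme; the function names, weights, and threshold are illustrative assumptions, not the paper's actual method.

```python
# Hedged sketch: late fusion of gaze and prosody scores for On-Talk /
# Off-Talk classification. All names, weights, and the threshold are
# illustrative assumptions, not taken from the paper.

def fuse_scores(gaze_score: float, prosody_score: float,
                gaze_weight: float = 0.5) -> float:
    """Weighted average of per-modality On-Talk confidences in [0, 1]."""
    return gaze_weight * gaze_score + (1.0 - gaze_weight) * prosody_score


def classify_utterance(gaze_score: float, prosody_score: float,
                       threshold: float = 0.5) -> str:
    """Label an utterance as addressed to the system or as Off-Talk."""
    fused = fuse_scores(gaze_score, prosody_score)
    return "on-talk" if fused >= threshold else "off-talk"


if __name__ == "__main__":
    # User looking at the device and speaking in a read-speech style:
    print(classify_utterance(gaze_score=0.9, prosody_score=0.8))  # on-talk
    # User looking away and mumbling to himself:
    print(classify_utterance(gaze_score=0.1, prosody_score=0.2))  # off-talk
```

A weighted average is only one of many possible combination rules; a trained classifier over the concatenated modality features would be another common choice.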
