EYE GAZE FOR SPOKEN LANGUAGE UNDERSTANDING IN MULTI-MODAL CONVERSATIONAL INTERACTIONS

Abstract

Improving accuracy in understanding and/or resolving references to visual elements in a visual context associated with a computerized conversational system is described. Techniques described herein leverage gaze input with gestures and/or speech input to improve spoken language understanding in computerized conversational systems. Leveraging gaze input and speech input improves spoken language understanding in conversational systems by increasing the accuracy with which the system can resolve references—or interpret a user's intent—with respect to visual elements in a visual context. In at least one example, the techniques herein describe tracking gaze to generate gaze input, recognizing speech input, and extracting gaze features and lexical features from the user input. Based at least in part on the gaze features and lexical features, user utterances directed to visual elements in a visual context can be resolved.
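To make the described pipeline concrete, the sketch below shows one way gaze features and lexical features could be combined to resolve a spoken reference to an on-screen visual element. This is a minimal illustration only, not the patented method: the duration-weighted gaze proximity score, the word-overlap lexical score, the class names, and the weights are all assumptions introduced here for clarity.

from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VisualElement:
    name: str      # label shown on screen, e.g. "Save button" (hypothetical)
    x: float       # screen-coordinate centre of the element
    y: float


@dataclass
class GazeFixation:
    x: float             # fixation point, in the same screen coordinates
    y: float
    duration_ms: float   # how long the gaze dwelt at this point


def gaze_feature(element: VisualElement, fixations: List[GazeFixation]) -> float:
    """Duration-weighted proximity of the user's gaze to an element (assumed feature)."""
    score = 0.0
    for f in fixations:
        dist = ((element.x - f.x) ** 2 + (element.y - f.y) ** 2) ** 0.5
        score += f.duration_ms / (1.0 + dist)
    return score


def lexical_feature(element: VisualElement, utterance: str) -> float:
    """Fraction of the element's label words that appear in the utterance (assumed feature)."""
    label_words = element.name.lower().split()
    spoken_words = set(utterance.lower().split())
    if not label_words:
        return 0.0
    return sum(w in spoken_words for w in label_words) / len(label_words)


def resolve_reference(elements: List[VisualElement],
                      fixations: List[GazeFixation],
                      utterance: str,
                      gaze_weight: float = 0.6,
                      lexical_weight: float = 0.4) -> Optional[VisualElement]:
    """Pick the element whose combined (normalized) gaze + lexical score is highest."""
    if not elements:
        return None
    gaze_scores = [gaze_feature(e, fixations) for e in elements]
    total_gaze = sum(gaze_scores) or 1.0  # avoid division by zero when gaze is absent
    combined = [
        gaze_weight * (g / total_gaze) + lexical_weight * lexical_feature(e, utterance)
        for e, g in zip(elements, gaze_scores)
    ]
    best_index = max(range(len(combined)), key=lambda i: combined[i])
    return elements[best_index]


if __name__ == "__main__":
    screen = [VisualElement("Save button", 100, 50),
              VisualElement("Delete button", 300, 50)]
    gaze = [GazeFixation(105, 52, duration_ms=400), GazeFixation(110, 48, duration_ms=250)]
    # "that one" carries no label words, so the gaze evidence decides the referent.
    print(resolve_reference(screen, gaze, "click that one").name)  # -> Save button

In this sketch the gaze scores are normalized across the candidate elements so they can be blended with the word-overlap score on a comparable scale; an actual system would learn such features and their weights from data rather than hand-tuning them.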
