Eye Gaze for Spoken Language Understanding in Multi-Modal Conversational Interactions

Abstract

Improving accuracy in understanding and/or resolving references to visual elements in a visual context associated with a computerized conversational system is described. Techniques described herein leverage gaze input with gestures and/or speech input to improve spoken language understanding in computerized conversational systems. Leveraging gaze input and speech input improves spoken language understanding in conversational systems by improving the accuracy by which the system can resolve references—or interpret a user's intent—with respect to visual elements in a visual context. In at least one example, the techniques herein describe tracking gaze to generate gaze input, recognizing speech input, and extracting gaze features and lexical features from the user input. Based at least in part on the gaze features and lexical features, user utterances directed to visual elements in a visual context can be resolved.
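
The pipeline summarized above (gaze features plus lexical features used to resolve which visual element an utterance refers to) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the method claimed in the patent: the `VisualElement` type, the two feature definitions (share of gaze samples inside an element's bounding box, share of an element's label words heard in the utterance), and the linear weighting are hypothetical choices, and speech recognition is assumed to have already produced the utterance text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VisualElement:
    """Hypothetical on-screen element: an id, its visible text, and a pixel box."""
    name: str
    label: str
    box: tuple  # (left, top, right, bottom)


def gaze_feature(element, gaze_points):
    """Fraction of gaze samples that fall inside the element's box."""
    if not gaze_points:
        return 0.0
    left, top, right, bottom = element.box
    hits = sum(1 for x, y in gaze_points
               if left <= x <= right and top <= y <= bottom)
    return hits / len(gaze_points)


def lexical_feature(element, utterance):
    """Fraction of the element's label words that appear in the utterance."""
    label_words = set(element.label.lower().split())
    if not label_words:
        return 0.0
    spoken_words = set(utterance.lower().split())
    return len(label_words & spoken_words) / len(label_words)


def resolve_reference(elements, gaze_points, utterance, gaze_weight=0.5):
    """Pick the element with the highest combined score (assumed linear mix)."""
    def score(element):
        return (gaze_weight * gaze_feature(element, gaze_points)
                + (1.0 - gaze_weight) * lexical_feature(element, utterance))
    return max(elements, key=score)


elements = [
    VisualElement("link_a", "weather forecast", (0, 0, 100, 50)),
    VisualElement("link_b", "sports news", (0, 60, 100, 110)),
]
gaze_points = [(10, 70), (20, 80), (30, 90), (50, 20)]

# "that one" names no element, so the gaze feature alone decides.
print(resolve_reference(elements, gaze_points, "open that one").name)  # link_b
```

In this toy setup the utterance carries no lexical evidence, so the element that attracted three of the four gaze samples wins; with an utterance such as "show the weather forecast" and no gaze samples, the lexical feature alone would select `link_a`.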

Bibliographic data

Publication number: US2019391640A1

Patent type:
Publication date: 2019-12-26

Original format: PDF
Applicant/Assignee: MICROSOFT TECHNOLOGY LICENSING LLC

Application number: US201916399414
Inventors: ANNA PROKOFIEVA; FETHIYE ASLI CELIKYILMAZ; DILEK Z HAKKANI-TUR; LARRY HECK; MALCOM SLANEY

Filing date: 2019-04-30
Classification: G06F3/01; G02B27; G10L17/22; G10L15/08; G10L15; G06K9; G06F3/16
Country: US
Database entry time: 2022-08-21 11:22:22
