Eye Gaze for Spoken Language Understanding in Multi-Modal Conversational Interactions
Abstract
Techniques are described for improving the accuracy with which a computerized conversational system understands and resolves references to visual elements in an associated visual context. The techniques combine gaze input with gesture and/or speech input to improve spoken language understanding. Combining gaze input and speech input lets the system more accurately resolve references, that is, interpret a user's intent, with respect to visual elements in the visual context. In at least one example, the techniques include tracking gaze to generate gaze input, recognizing speech input, and extracting gaze features and lexical features from the user input. User utterances directed to visual elements in the visual context can then be resolved based at least in part on the gaze features and lexical features.
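The idea of resolving an utterance from gaze features and lexical features can be illustrated with a minimal sketch. This is not the method claimed in the patent: the abstract does not specify a model, so the sketch assumes a simple weighted sum of two hand-written scores. Every name here (VisualElement, lexical_score, gaze_score, resolve_reference, sigma, gaze_weight) is hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class VisualElement:
    """An on-screen item the user may refer to (hypothetical structure)."""
    name: str   # text label shown on screen
    x: float    # centre of the element, in pixels
    y: float


def lexical_score(utterance: str, element: VisualElement) -> float:
    """Lexical feature: fraction of the label's words present in the utterance."""
    spoken = set(utterance.lower().split())
    label = element.name.lower().split()
    if not label:
        return 0.0
    return sum(word in spoken for word in label) / len(label)


def gaze_score(fixations: Sequence[Tuple[float, float]],
               element: VisualElement,
               sigma: float = 100.0) -> float:
    """Gaze feature: mean Gaussian proximity of fixations to the element centre.

    sigma (pixels) is an assumed tolerance for gaze-tracking noise.
    """
    if not fixations:
        return 0.0
    total = 0.0
    for gx, gy in fixations:
        dist_sq = (gx - element.x) ** 2 + (gy - element.y) ** 2
        total += math.exp(-dist_sq / (2 * sigma ** 2))
    return total / len(fixations)


def resolve_reference(utterance: str,
                      fixations: Sequence[Tuple[float, float]],
                      elements: Sequence[VisualElement],
                      gaze_weight: float = 0.5) -> VisualElement:
    """Return the element with the highest combined gaze + lexical score.

    The linear combination is an assumption standing in for whatever
    model a real system would use.
    """
    def combined(element: VisualElement) -> float:
        return (gaze_weight * gaze_score(fixations, element)
                + (1 - gaze_weight) * lexical_score(utterance, element))

    return max(elements, key=combined)


if __name__ == "__main__":
    screen = [
        VisualElement("The Matrix", 100, 100),
        VisualElement("The Matrix Reloaded", 500, 100),
    ]
    # Speech alone favours the shorter title, whose label is fully matched.
    print(resolve_reference("play the matrix", [], screen).name)
    # Fixations near the second element shift the decision toward it.
    print(resolve_reference("play the matrix", [(495, 105), (510, 95)], screen).name)
```

In this sketch, an utterance with no usable lexical overlap (for example "play that one") is resolved by gaze alone, which is the kind of reference the abstract says gaze input helps with.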