
Eye gaze for reference resolution in multimodal conversational interfaces.



Abstract

Multimodal conversational interfaces allow users to carry on a spoken dialogue with an artificial conversational agent while looking at a graphical display. The dialogue is used to accomplish purposeful tasks. Motivated by previous psycholinguistic findings, this dissertation investigates how eye gaze contributes to automated spoken language understanding in such a setting, specifically focusing on robust reference resolution---a process that identifies the referring expressions in an utterance and determines which entities these expressions refer to. As a part of this investigation, we attempt to model user focus of attention during human-machine conversation by utilizing the user's naturally occurring eye gaze. We study which eye gaze and auxiliary visual factors contribute to this model's accuracy. Among the various features extracted from eye gaze, fixation intensity has been shown to be the most indicative of attention. We combine user speech with this gaze-based attentional model in an integrated reference resolution framework. This framework fuses linguistic, dialogue, domain, and eye gaze information to robustly resolve the various kinds of referring expressions that occur during human-machine conversation. Our studies have shown that, within this framework, eye gaze can compensate for limited domain models and dialogue processing capability. We further extend the framework to handle recognized speech input acquired during situated dialogue within an immersive virtual environment. We utilize word confusion networks to model the set of alternative speech recognition hypotheses and incorporate these networks into the reference resolution framework. The empirical results indicate that incorporating eye gaze significantly improves reference resolution performance, especially when only limited domain model information is available to the reference resolution framework. The empirical results also indicate that modeling recognized speech via confusion networks, rather than the single best recognition hypothesis, leads to better reference resolution performance.
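The gaze-based attentional model described above rests on fixation intensity: how long the user's gaze dwells on each on-screen object around the time of an utterance. A minimal sketch of that idea follows; the `(object_id, start_time, end_time)` fixation schema, the window bounds, and the normalization into a salience score are illustrative assumptions, not the dissertation's actual formulation.

```python
from collections import defaultdict

def fixation_intensity(fixations, window_start, window_end):
    """Aggregate gaze fixation duration per on-screen object within a
    time window (e.g. the seconds surrounding a referring expression).

    `fixations` is a list of (object_id, start_time, end_time) tuples;
    this schema is hypothetical, chosen for illustration.
    """
    intensity = defaultdict(float)
    for obj, start, end in fixations:
        # Clip each fixation to the window and accumulate its overlap.
        overlap = min(end, window_end) - max(start, window_start)
        if overlap > 0:
            intensity[obj] += overlap
    total = sum(intensity.values())
    # Normalize to a probability-like salience score over objects.
    return {obj: d / total for obj, d in intensity.items()} if total else {}

# Example: two fixations on "cup_1" and one on "plate_2" in a 2 s window;
# "cup_1" receives the highest salience.
fixations = [("cup_1", 0.1, 0.5), ("plate_2", 0.6, 0.9), ("cup_1", 1.0, 1.6)]
print(fixation_intensity(fixations, 0.0, 2.0))
```

A reference resolver could then combine such salience scores with linguistic and domain evidence when ranking candidate referents.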
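A word confusion network, as used above to represent alternative recognition hypotheses, can be pictured as a sequence of slots, each holding competing word hypotheses with posterior probabilities. The sketch below uses toy probabilities (not output of any real recognizer) and assumes one word per slot with independence between slots, a simplification for illustration.

```python
def phrase_posterior(network, phrase):
    """Posterior of a word sequence under a confusion network, assuming
    one word per slot and independence between slots (a simplification).

    `network` is a list of dicts mapping word hypotheses to posteriors.
    """
    prob = 1.0
    for slot, word in zip(network, phrase):
        prob *= slot.get(word, 0.0)
    return prob

# Toy network: the 1-best path is "the bread cup", yet the competing
# referring expression "the red cup" retains substantial probability,
# so reference resolution can weigh both against the visual context.
confusion_net = [
    {"the": 0.9, "a": 0.1},
    {"red": 0.4, "bread": 0.6},
    {"cup": 0.7, "cap": 0.3},
]
print(phrase_posterior(confusion_net, ["the", "red", "cup"]))
```

Scoring referring expressions over all paths of the network, rather than over the single best hypothesis alone, is what lets the framework recover referents that the 1-best transcription would miss.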

Bibliographic record

  • Author

    Prasov, Zahar.

  • Affiliation

    Michigan State University.

  • Degree grantor: Michigan State University.
  • Subject: Psychology, Cognitive; Computer Science.
  • Degree: Ph.D.
  • Year: 2011
  • Pagination: 168 p.
  • Total pages: 168
  • Format: PDF
  • Language: eng
  • CLC classification
  • Keywords
