Harvesting and summarizing user-generated content for advanced speech-based human-computer interaction

Abstract

Many assistant applications on mobile devices help people obtain rich Web content such as user-generated data (e.g., reviews, posts, blogs, and tweets). However, online communities and social networks are expanding rapidly, and it is impossible for people to browse and digest all of this information through a simple search interface. To help users obtain information more efficiently, both the interface for data access and the representation of the information need to be improved. An intuitive and personalized interface, such as a dialogue system, could be an ideal assistant: it engages the user in a continuous dialogue to garner the user's interest and capture the user's intent, and it assists the user via speech-navigated interactions. In addition, there is a great need for a type of application that can harvest data from the Web, summarize the information concisely, and present it in an aggregated yet natural way, such as direct human dialogue. This thesis therefore aims to conduct research on a universal framework for developing speech-based interfaces that can aggregate user-generated Web content and present the summarized information via speech-based human-computer interaction. To accomplish this goal, several challenges must be met. First, how can users' intentions be interpreted correctly from their spoken input? Second, how can the semantics and sentiment of user-generated data be interpreted and aggregated into structured yet concise summaries? Last, how can a dialogue modeling mechanism be developed to handle discourse and present the highlighted information in natural language? This thesis explores plausible approaches to these challenges. We explore a lexicon modeling approach for semantic tagging to improve spoken language understanding and query interpretation. We investigate a parse-and-paraphrase paradigm and a sentiment scoring mechanism for information extraction from unstructured user-generated data. We also explore sentiment-involved dialogue modeling and corpus-based language generation approaches for dialogue and discourse. Multilingual prototype systems in multiple domains have been implemented for demonstration.