Instance-based question answering.

Abstract

In recent years, question answering (QA) has grown from simple passage retrieval and information extraction to very complex approaches that incorporate deep question and document analysis, reasoning, planning, and sophisticated uses of knowledge resources. Most existing QA systems combine rule-based, knowledge-based, and statistical components, and are highly optimized for a particular style of questions in a given language. Typical question answering approaches depend on specific ontologies, resources, processing tools, and document sources, and very often rely on expert knowledge and rule-based components. Furthermore, such systems are very difficult to re-train and optimize for different domains and languages, requiring considerable time and human effort.

We present a fully statistical, data-driven, instance-based approach to question answering (IBQA) that learns how to answer new questions from similar training questions and their known correct answers. We represent training questions as points in a multi-dimensional space and cluster them according to different granularity, scatter, and similarity metrics. From each individual cluster we automatically learn an answering strategy for finding answers to questions. When answering a new question that is covered by several clusters, multiple answering strategies are simultaneously employed. The resulting answer confidence combines elements such as each strategy's estimated probability of success, cluster similarity to the new question, cluster size, and cluster granularity. The IBQA approach obtains good performance on factoid and definitional questions, comparable to the performance of top systems participating in official question answering evaluations.

Each answering strategy is cluster-specific and consists of an expected answer model, a query content model, and an answer extraction model. The expected answer model is derived from all training questions in its cluster and takes the form of a distribution over all possible answer types. The query content model for document retrieval is constructed using content from queries that are successful on training questions in that cluster. Finally, we train cluster-specific answer extractors on training data and use them to find answers to new questions.

The IBQA approach is resource non-intensive, but can easily be extended to incorporate knowledge resources or rule-based components. Since it does not rely on hand-written rules, expert knowledge, or manually tuned parameters, it is less dependent on a particular language or domain, allowing for fast re-training with minimum human effort. Under limited data, our implementation of an IBQA system achieves good performance, improves with additional training instances, and is easily trainable and adaptable to new types of data. The IBQA approach provides a principled, robust, and easy-to-implement base system that serves as a well-performing platform for further domain-specific adaptation.
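
The abstract names the ingredients of the answer confidence but does not give a formula, so the following Python sketch is only one plausible reading. It estimates an expected answer model as a relative-frequency distribution over the answer types in a cluster, and it pools candidate answers proposed by several cluster-specific strategies. The `Cluster` class, the function names, the multiplicative weighting, and the log-damped size factor are all assumptions made for illustration. Cluster granularity is left out for brevity.

```python
from collections import Counter
from dataclasses import dataclass
from math import log


@dataclass
class Cluster:
    """Hypothetical container for one cluster of training questions."""
    name: str
    answer_types: list      # answer type of each training question in the cluster
    success_rate: float     # estimated probability that this cluster's strategy succeeds


def expected_answer_model(cluster):
    """Distribution over answer types, estimated by relative frequency."""
    counts = Counter(cluster.answer_types)
    total = sum(counts.values())
    return {answer_type: n / total for answer_type, n in counts.items()}


def strategy_weight(cluster, similarity):
    """Assumed weighting: success rate x similarity x log-damped cluster size."""
    size_factor = log(1 + len(cluster.answer_types))
    return cluster.success_rate * similarity * size_factor


def rank_answers(candidates):
    """Pool candidates from several strategies and normalize their scores.

    Each candidate is (answer, extractor_score, cluster, similarity), where
    similarity is the cluster's similarity to the new question.
    """
    scores = {}
    for answer, extractor_score, cluster, similarity in candidates:
        weight = strategy_weight(cluster, similarity)
        scores[answer] = scores.get(answer, 0.0) + extractor_score * weight
    total = sum(scores.values())
    if total == 0:
        return []
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(answer, score / total) for answer, score in ranked]


# Toy data: two overlapping clusters cover the same new question.
when_born = Cluster("when was X born", ["date", "date", "year", "date"], 0.6)
when_any = Cluster("when ...", ["date", "year", "year", "date", "duration", "date"], 0.4)
candidates = [
    ("1756", 0.9, when_born, 0.8),
    ("1756", 0.5, when_any, 0.5),
    ("1791", 0.4, when_any, 0.5),
]
print(expected_answer_model(when_born))
print(rank_answers(candidates))
```

Summing the weighted scores lets an answer supported by several clusters outrank one proposed by a single strategy, which matches the idea of employing multiple answering strategies simultaneously. Other combinations, such as a learned model over the same features, would fit the description equally well.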
