Venue: Australasian Joint Conference on Artificial Intelligence

Open-Domain Question Answering Framework Using Wikipedia

Abstract

This paper explores the feasibility of implementing a model for an open-domain, automated question-answering framework that leverages Wikipedia's knowledge base. While Wikipedia implicitly contains answers to common questions, two challenges stand out: disambiguating natural language, and developing an information retrieval process that produces sufficiently specific answers. However, observational analysis suggests that the syntactic and lexical structure of a sentence can be disregarded when a question contains a specific target entity (words that identify a person, location or organisation) and queries a property related to that entity. To investigate this, we implemented an algorithmic process that extracted the target entity from the question using CRF-based named entity recognition (NER) and treated all remaining words as potential properties. Using DBpedia, an ontological database of Wikipedia's knowledge, we searched for the closest matching property that would produce an answer by applying standard string matching algorithms, including the Levenshtein distance, similar text and Dice's coefficient. Our experimental results illustrate that using Wikipedia as a knowledge base produces high precision for questions that contain a single unambiguous entity as the subject, but lower accuracy for questions where the entity appears as part of the object.
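
The property-matching step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the NER stage has already removed the target entity, and that the entity's property names are available as a plain list (in the paper they would come from DBpedia). The helper name `best_property`, the equal-weight averaging of the two scores, and the sample property names are all assumptions made for illustration. The "similar text" measure is omitted.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Edit distance via the classic two-row dynamic programme."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string on the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def bigrams(s: str) -> list:
    """Adjacent character pairs of s."""
    return [s[i:i + 2] for i in range(len(s) - 1)]


def dice_coefficient(a: str, b: str) -> float:
    """Dice's coefficient over character bigrams (counted as multisets)."""
    ba, bb = Counter(bigrams(a)), Counter(bigrams(b))
    total = sum(ba.values()) + sum(bb.values())
    if total == 0:  # both strings shorter than two characters
        return 1.0 if a == b else 0.0
    overlap = sum((ba & bb).values())
    return 2 * overlap / total


def best_property(candidate_words, properties):
    """Hypothetical helper: return (property, word, score) for the best pair.

    The score is an unweighted mean of normalised Levenshtein similarity
    and Dice's coefficient; this combination is an assumption, not taken
    from the paper.
    """
    best = None
    for word in candidate_words:
        for prop in properties:
            w, p = word.lower(), prop.lower()
            lev_sim = 1 - levenshtein(w, p) / max(len(w), len(p), 1)
            score = (lev_sim + dice_coefficient(w, p)) / 2
            if best is None or score > best[2]:
                best = (prop, word, score)
    return best


if __name__ == "__main__":
    # "Who is the spouse of Barack Obama?" with the entity already removed.
    remaining = ["Who", "is", "the", "spouse", "of"]
    # Illustrative property names, not fetched from DBpedia.
    props = ["birthPlace", "spouse", "almaMater", "birthDate"]
    print(best_property(remaining, props))
```

Every remaining word is scored against every property name, so stop words such as "the" are not filtered out; they simply score poorly. This mirrors the abstract's idea of ignoring sentence structure and relying on string similarity alone.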
