Venue: Conference on Empirical Methods in Natural Language Processing

WikiQA: A Challenge Dataset for Open-Domain Question Answering

Abstract

We describe the WikiQA dataset, a new publicly available set of question and sentence pairs, collected and annotated for research on open-domain question answering. Most previous work on answer sentence selection focuses on a dataset created using the TREC-QA data, which includes editor-generated questions and candidate answer sentences selected by matching content words in the question. WikiQA is constructed using a more natural process and is more than an order of magnitude larger than the previous dataset. In addition, the WikiQA dataset also includes questions for which there are no correct sentences, enabling researchers to work on answer triggering, a critical component in any QA system. We compare several systems on the task of answer sentence selection on both datasets and also describe the performance of a system on the problem of answer triggering using the WikiQA dataset.
