A Dataset of Information-Seeking Questions and Answers Anchored in Research Papers

Abstract

Readers of academic research papers often read with the goal of answering specific questions. Question Answering systems that can answer those questions can make consumption of the content much more efficient. However, building such tools requires data that reflect the difficulty of the task arising from complex reasoning about claims made in multiple parts of a paper. In contrast, existing information-seeking question answering datasets usually contain questions about generic factoid-type information. We therefore present QASPER, a dataset of 5,049 questions over 1,585 Natural Language Processing papers. Each question is written by an NLP practitioner who read only the title and abstract of the corresponding paper, and the question seeks information present in the full text. The questions are then answered by a separate set of NLP practitioners who also provide supporting evidence to answers. We find that existing models that do well on other QA tasks do not perform well on answering these questions, underperforming humans by at least 27 F1 points when answering them from entire papers, motivating further research in document-grounded, information-seeking QA, which our dataset is designed to facilitate.
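
The abstract reports the gap between models and humans in F1 points but does not define the metric. As a rough illustration only, the sketch below computes a token-overlap F1 between a predicted and a reference answer, a common choice in question answering evaluation. The function name `token_f1`, the lowercasing, and the whitespace tokenization are assumptions made for this example; this is not the authors' evaluation script.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between two answer strings (hypothetical helper).

    Assumes lowercased, whitespace-split tokens; real evaluation scripts
    often also strip punctuation and articles.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()

    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)

    # Multiset intersection counts each shared token at most as often
    # as it appears in both strings.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

On a 0 to 1 scale like this one, a gap of 27 F1 points corresponds to a difference of 0.27 in the averaged score.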

Bibliographic Record

Source: Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies | 2021 | pp. 4599-4610 | 12 pages
Authors: Pradeep Dasigi; Kyle Lo; Iz Beltagy; Arman Cohan; Noah A. Smith; Matt Gardner
Original format: PDF
