A Mighty Dataset for Stress-Testing Question Answering Systems


Abstract

The general goal of semantic question answering systems is to provide correct answers to natural language queries, given a number of structured datasets. The increasingly broad deployment of question answering (QA) systems in everyday life requires a comparable and reliable rating of how well QA systems perform and how scalable they are. To achieve this, we developed a massive dataset of more than 2 million natural language questions and their SPARQL queries for the DBpedia dataset. We combined natural language processing and linked open data to automatically generate this large number of valid question-query pairs. Our aim is to assist the benchmarking or scoring of QA systems in terms of answering questions in a range of languages, retrieving answers from heterogeneous sources, or answering massive numbers of questions within a limited time. This dataset represents an ideal choice for stress-testing systems' scalability, speed and correctness. As such, it has already been included in the Large-scale QA task of the Question Answering Over Linked Data (QALD) Challenge and the HOBBIT project Question Answering Benchmark.
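To make the notion of a question-query pair concrete, the following is a minimal illustrative sketch of how such pairs might be generated from templates. The abstract does not describe the actual generation procedure, so the templates, the entity list, and the names `Template` and `generate_pairs` are all assumptions made for illustration, not the authors' method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    """Hypothetical template pairing a question pattern with a SPARQL pattern."""
    question: str  # natural language pattern with a {label} slot
    sparql: str    # SPARQL pattern with a {uri} slot ({{ }} are literal braces)


# Illustrative templates only; they are not taken from the paper.
TEMPLATES = [
    Template(
        question="What is the capital of {label}?",
        sparql="SELECT ?x WHERE {{ <{uri}> <http://dbpedia.org/ontology/capital> ?x }}",
    ),
    Template(
        question="What is the population of {label}?",
        sparql="SELECT ?x WHERE {{ <{uri}> <http://dbpedia.org/ontology/populationTotal> ?x }}",
    ),
]

# Illustrative (label, URI) pairs; a real pipeline would draw these from DBpedia.
ENTITIES = [
    ("Germany", "http://dbpedia.org/resource/Germany"),
    ("France", "http://dbpedia.org/resource/France"),
]


def generate_pairs(templates, entities):
    """Fill every template with every entity, yielding (question, query) pairs."""
    pairs = []
    for template in templates:
        for label, uri in entities:
            question = template.question.format(label=label)
            query = template.sparql.format(uri=uri)
            pairs.append((question, query))
    return pairs


pairs = generate_pairs(TEMPLATES, ENTITIES)
for question, query in pairs:
    print(question)
    print("  " + query)
```

Because the number of pairs grows as the product of templates and entities, even a modest template set applied to a large knowledge base could plausibly reach millions of pairs, which is the scale the abstract reports.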

Bibliographic record

Source: IEEE International Conference on Semantic Computing | 2018 | 419p | 4 pages
Authors: Bastian Haarmann; Claudio Martens; Henning Petzka; Giulio Napolitano
Original format: PDF
Chinese Library Classification: TP301-53
Keywords: Knowledge discovery; Benchmark testing; Natural languages; Task analysis; Resource description framework; Standards; Linked data

