Venue: IEEE International Conference on Semantic Computing

A Mighty Dataset for Stress-Testing Question Answering Systems

Abstract

The general goal of semantic question answering systems is to provide correct answers to natural language queries, given a number of structured datasets. The increasingly broad deployment of question answering (QA) systems in everyday life requires a comparable and reliable rating of how well QA systems perform and how scalable they are. To achieve this, we developed a massive dataset of more than 2 million natural language questions and their SPARQL queries for the DBpedia dataset. We combined natural language processing and linked open data to automatically generate this large number of valid question-query pairs. Our aim is to assist the benchmarking or scoring of QA systems in terms of answering questions in a range of languages, retrieving answers from heterogeneous sources, or answering massive amounts of questions within a limited time. This dataset represents an ideal choice for stress-testing systems' scalability, speed, and correctness. As such, it has already been included in the Large-scale QA task of the Question Answering Over Linked Data (QALD) Challenge and the HOBBIT project Question Answering Benchmark.
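
The abstract does not say how the question-query pairs are generated. As a rough illustration only, one common approach is to fill paired question and SPARQL templates with entities drawn from a knowledge base. The sketch below assumes that approach; the `Template` and `Entity` classes, the two templates, and the `generate_pairs` function are hypothetical and are not taken from the paper. The DBpedia URIs are used purely as example values.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple


@dataclass(frozen=True)
class Template:
    """A hypothetical paired template: one question, one SPARQL query."""
    question: str  # uses the {label} placeholder
    sparql: str    # uses the {uri} placeholder; literal braces are doubled


@dataclass(frozen=True)
class Entity:
    """A knowledge-base entity with a human-readable label and a URI."""
    label: str
    uri: str


# Illustrative templates only; the paper's actual templates are not given.
TEMPLATES = [
    Template(
        question="What is the population of {label}?",
        sparql=(
            "SELECT ?v WHERE {{ <{uri}> "
            "<http://dbpedia.org/ontology/populationTotal> ?v }}"
        ),
    ),
    Template(
        question="In which country is {label} located?",
        sparql=(
            "SELECT ?v WHERE {{ <{uri}> "
            "<http://dbpedia.org/ontology/country> ?v }}"
        ),
    ),
]

# Example entities; a real pipeline would fetch these from the knowledge base.
ENTITIES = [
    Entity(label="Berlin", uri="http://dbpedia.org/resource/Berlin"),
    Entity(label="Paris", uri="http://dbpedia.org/resource/Paris"),
]


def generate_pairs(
    templates: Iterable[Template], entities: Iterable[Entity]
) -> Iterator[Tuple[str, str]]:
    """Yield one (question, SPARQL query) pair per entity/template combination."""
    templates = list(templates)  # allow re-iteration for every entity
    for entity in entities:
        for template in templates:
            yield (
                template.question.format(label=entity.label),
                template.sparql.format(uri=entity.uri),
            )


if __name__ == "__main__":
    for question, query in generate_pairs(TEMPLATES, ENTITIES):
        print(question)
        print("  " + query)
```

Because the number of pairs grows as templates times entities, even a modest set of templates can yield a very large dataset when applied across a large knowledge base. Checking that each generated query actually returns an answer would require running it against a SPARQL endpoint, which this sketch leaves out.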
