Published in: IEEE International Conference on Semantic Computing

A Mighty Dataset for Stress-Testing Question Answering Systems

Abstract

The general goal of semantic question answering systems is to provide correct answers to natural language queries, given a number of structured datasets. The increasingly broad deployment of question answering (QA) systems in everyday life requires a comparable and reliable rating of how well QA systems perform and how scalable they are. To achieve this, we developed a massive dataset of more than 2 million natural language questions and their SPARQL queries for the DBpedia dataset. We combined natural language processing and linked open data to automatically generate this large number of valid question-query pairs. Our aim is to assist the benchmarking or scoring of QA systems in terms of answering questions in a range of languages, retrieving answers from heterogeneous sources, or answering large numbers of questions within a limited time. This dataset is well suited to stress-testing systems' scalability, speed and correctness. As such, it has already been included in the Large-scale QA task of the Question Answering Over Linked Data (QALD) Challenge and the HOBBIT project Question Answering Benchmark.
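
The abstract does not say how the question-query pairs were generated. As a purely illustrative sketch, one common way to produce such pairs at scale is template instantiation: a question pattern and a SPARQL pattern are filled with the same entity and property. The templates, the `QAPair` type, and the `generate_pairs` function below are hypothetical and are not taken from the paper; only the DBpedia resource and ontology namespaces are real.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QAPair:
    """One natural language question with its SPARQL query (hypothetical type)."""
    question: str
    sparql: str


# Hypothetical templates; the paper's actual generation method is not described here.
QUESTION_TEMPLATE = "What is the {label} of {entity_label}?"
SPARQL_TEMPLATE = (
    "SELECT ?answer WHERE {{ "
    "<http://dbpedia.org/resource/{entity}> "
    "<http://dbpedia.org/ontology/{prop}> ?answer . }}"
)


def generate_pairs(entities, properties):
    """Fill both templates for every (entity, property) combination.

    entities:   DBpedia resource names, e.g. "Berlin"
    properties: mapping from ontology property name to a human-readable label
    """
    pairs = []
    for entity in entities:
        for prop, label in properties.items():
            pairs.append(
                QAPair(
                    question=QUESTION_TEMPLATE.format(
                        label=label, entity_label=entity.replace("_", " ")
                    ),
                    sparql=SPARQL_TEMPLATE.format(entity=entity, prop=prop),
                )
            )
    return pairs


if __name__ == "__main__":
    demo = generate_pairs(["Berlin"], {"populationTotal": "population"})
    for pair in demo:
        print(pair.question)
        print(pair.sparql)
```

Because the number of pairs grows as entities times properties times templates, even a small set of templates can yield a very large dataset, which is what makes this style of generation attractive for stress tests.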
