
SINA: Semantic interpretation of user queries for question answering on interlinked data



Abstract

The architectural choices underlying Linked Data have led to a compendium of data sources that contain both duplicated and fragmented information on a large number of domains. One way to enable non-expert users to access this data compendium is to provide keyword search frameworks that can capitalize on the inherent characteristics of Linked Data. Developing such systems is challenging for three main reasons. First, resources across different datasets or even within the same dataset can be homonyms. Second, different datasets employ heterogeneous schemas, and each one may contain only a part of the answer to a given user query. Finally, constructing a federated formal query from keywords across different datasets requires exploiting links between the datasets on both the schema and instance levels. We present Sina, a scalable keyword search system that can answer user queries by transforming user-supplied keywords or natural-language queries into conjunctive SPARQL queries over a set of interlinked data sources. Sina uses a hidden Markov model to determine the most suitable resources for a user-supplied query from different datasets. Moreover, our framework is able to construct federated queries by using the disambiguated resources and leveraging the link structure underlying the datasets to be queried. We evaluate Sina over three different datasets. We can answer 25 queries from QALD-1 correctly. Moreover, we perform as well as the best question answering system from the QALD-3 competition by answering 32 questions correctly, while also being able to answer queries on distributed sources. We study the runtime of Sina in its mono-core and parallel implementations and draw preliminary conclusions on the scalability of keyword search on Linked Data.
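The abstract says that a hidden Markov model picks the most suitable resource for each keyword but gives no model details. The following is only a minimal sketch of the general idea, using standard Viterbi decoding: hidden states are candidate resources and observations are keywords. The resource names, the keywords, and every probability below are illustrative assumptions, not values or parameters from the paper.

```python
import math

def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state sequence for the observations."""
    # best[t][s] = (log-probability of the best path ending in s at step t, previous state)
    first = observations[0]
    best = [{s: (math.log(start_p[s]) + math.log(emit_p[s][first]), None) for s in states}]
    for obs in observations[1:]:
        layer = {}
        for s in states:
            layer[s] = max(
                (best[-1][p][0] + math.log(trans_p[p][s]) + math.log(emit_p[s][obs]), p)
                for p in states
            )
        best.append(layer)
    # Backtrack from the best final state to recover the full path.
    path = [max(states, key=lambda s: best[-1][s][0])]
    for layer in reversed(best[1:]):
        path.append(layer[path[-1]][1])
    path.reverse()
    return path

# Hypothetical candidate resources for the keywords "mayor" and "paris".
# "paris" is a homonym here: it could denote the city or the person.
STATES = ["dbo:mayor", "dbr:Paris", "dbr:Paris_Hilton"]
START_P = {s: 1 / 3 for s in STATES}
# Assumed emission probabilities: how well a keyword matches a resource label.
EMIT_P = {
    "dbo:mayor": {"mayor": 0.98, "paris": 0.02},
    "dbr:Paris": {"mayor": 0.02, "paris": 0.98},
    "dbr:Paris_Hilton": {"mayor": 0.02, "paris": 0.98},
}
# Assumed transition probabilities: how closely two resources are related.
TRANS_P = {
    "dbo:mayor": {"dbo:mayor": 0.1, "dbr:Paris": 0.8, "dbr:Paris_Hilton": 0.1},
    "dbr:Paris": {"dbo:mayor": 0.8, "dbr:Paris": 0.1, "dbr:Paris_Hilton": 0.1},
    "dbr:Paris_Hilton": {"dbo:mayor": 0.1, "dbr:Paris": 0.45, "dbr:Paris_Hilton": 0.45},
}

result = viterbi(["mayor", "paris"], STATES, START_P, TRANS_P, EMIT_P)
print(result)
```

With these toy numbers, the decoder resolves "paris" to the city rather than the person, because the assumed transition from the mayor property to the city is much more likely. Log-probabilities are used to avoid underflow on longer keyword sequences, which requires all probabilities to be non-zero.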

Bibliographic details

  • Source
    Journal of Web Semantics | 2015, Issue 1 | pp. 39-51 | 13 pages
  • Author affiliations

    Department of Computer Science, IFI/AKSW, Universitaet Leipzig, Germany, Department of Enterprise Information Systems (EIS), Institute for Applied Computer Science at University of Bonn, Germany;

    Department of Enterprise Information Systems (EIS), Institute for Applied Computer Science at University of Bonn, Germany;

    Department of Enterprise Information Systems (EIS), Institute for Applied Computer Science at University of Bonn, Germany;

    Department of Computer Science, IFI/AKSW, Universitaet Leipzig, Germany, Department of Enterprise Information Systems (EIS), Institute for Applied Computer Science at University of Bonn, Germany, Fraunhofer Institute for Intelligent Analysis and Information Systems, Bonn, Germany;

  • Indexing information
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification
  • Keywords

    Keyword search; Question answering; Hidden Markov model; SPARQL; RDF; Disambiguation;

