Integration, Provenance, and Temporal Queries for Large-Scale Knowledge Bases.

Abstract

Knowledge bases that summarize web information in RDF triples deliver many benefits, including support for natural language question answering and powerful structured queries that extract encyclopedic knowledge via SPARQL. Large-scale knowledge bases grow rapidly in scale and significance, and undergo frequent changes in both schema and content. Two critical problems have thus emerged: (i) how to support temporal queries that explore the history of knowledge bases or flash back to the past; and (ii) how to integrate knowledge from different sources and improve the quality of the integrated knowledge base while preserving provenance information. In this dissertation, we propose a framework that supports knowledge integration, temporal query evaluation, and user-friendly interfaces for large-scale knowledge bases. Toward this goal, we make the following contributions:

(i) We propose SPARQLT, a temporal extension of the structured query language SPARQL based on a point-based temporal model, which simplifies the expression of temporal joins and eliminates the need for temporal coalescing. This approach makes possible an end-user interface, HKB (Historical Knowledge Browser), where users can browse the evolution history of knowledge bases and express historical queries via simple by-example conditions in the infoboxes of Wikipedia pages.

(ii) We have designed and implemented RDF-TX (RDF Temporal eXpress), an efficient system for managing temporal RDF data and evaluating SPARQLT queries. RDF-TX takes advantage of compressed Multiversion B+ trees to achieve fast evaluation of temporal queries. Experimental results demonstrate that our indexing and query optimization techniques deliver superior performance over other systems.

(iii) We propose a framework for knowledge extraction and integration. We first introduce IBMiner, a novel NLP-based system that derives knowledge bases from free text and preserves the provenance of extracted triples. IBMiner uses a deep NLP-based approach to extract subject-attribute-value triples from free text, and maps the attributes to those introduced in existing knowledge bases. We then integrate public knowledge bases with the knowledge base generated by IBMiner into one of superior quality and coverage, called IKBStore. User-friendly interfaces are provided to manage the knowledge in IKBStore while maintaining provenance information.

Bibliographic record

  • Author: Gao, Shi
  • Author affiliation: University of California, Los Angeles
  • Degree-granting institution: University of California, Los Angeles
  • Subject: Computer science
  • Degree: Ph.D.
  • Year: 2016
  • Pages: 111 p.
  • Total pages: 111
  • Original format: PDF
  • Language: English
