Expressive and Flexible Access to Web-Extracted Data: A Keyword-based Structured Query Language

Abstract

Automated extraction of structured data from Web sources often leads to large heterogeneous knowledge bases (KBs), with data and schema items numbering in the hundreds of thousands or millions. Formulating information needs with conventional structured query languages is difficult due to the sheer size of the schema information available to the user. We address this challenge by proposing a new query language that blends keyword search with structured query processing over large information graphs with rich semantics. Our formalism for structured queries based on keywords combines the flexibility of keyword search with the expressiveness of structured queries. We propose a solution to the disambiguation problem that arises when keywords are introduced as primitives in a structured query language. We show how expressions in our proposed language can be rewritten using the vocabulary of the web-extracted KB, and how different possible rewritings can be ranked based on their syntactic relationship to the keywords in the query as well as their semantic coherence in the underlying KB. An extensive experimental study demonstrates the efficiency and effectiveness of our approach. Additionally, we show how our query language fits into QUICK, an end-to-end information system that integrates web-extracted data graphs with full-text search. In this system, the rewritten query describes an arbitrary topic of interest for which corresponding entities, and documents relevant to the entities, are efficiently retrieved.
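
The ranking idea in the abstract (score each candidate rewriting by how closely its KB terms match the query keywords and by how coherent those terms are in the KB) can be illustrated with a toy sketch. This is not the paper's algorithm: the three-fact KB, the trigram Jaccard similarity, the pairwise co-occurrence measure, the equal weighting, and all function names are assumptions made for illustration only.

```python
from itertools import product

# Hypothetical toy KB of (subject, relation, object) facts; not from the paper.
FACTS = {
    ("Guitarist", "hasInstrument", "Guitar"),
    ("Guitarist", "bornIn", "City"),
    ("Guitar_Company", "locatedIn", "City"),
}
VOCAB = sorted({term for fact in FACTS for term in fact})


def trigrams(text):
    """Set of lowercase character trigrams of a string."""
    text = text.lower()
    return {text[i:i + 3] for i in range(len(text) - 2)}


def syntactic(keyword, item):
    """Assumed syntactic score: Jaccard similarity over character trigrams."""
    a, b = trigrams(keyword), trigrams(item)
    return len(a & b) / len(a | b) if a | b else 0.0


def coherence(items):
    """Assumed semantic score: share of item pairs that co-occur in a fact."""
    pairs = [(a, b) for i, a in enumerate(items) for b in items[i + 1:]]
    if not pairs:
        return 1.0
    hits = sum(any(a in fact and b in fact for fact in FACTS) for a, b in pairs)
    return hits / len(pairs)


def rank_rewritings(keywords, top_k=2, weight=0.5):
    """Map each keyword to KB terms and rank the resulting combinations."""
    candidates = []
    for kw in keywords:
        scored = sorted(VOCAB, key=lambda item: syntactic(kw, item), reverse=True)
        # Keep at most top_k terms that share at least one trigram with kw.
        candidates.append([it for it in scored[:top_k] if syntactic(kw, it) > 0])
    ranked = []
    for combo in product(*candidates):
        syn = sum(syntactic(k, it) for k, it in zip(keywords, combo)) / len(keywords)
        ranked.append((weight * syn + (1 - weight) * coherence(combo), combo))
    return sorted(ranked, reverse=True)


if __name__ == "__main__":
    for score, combo in rank_rewritings(["guitar", "city"]):
        print(round(score, 3), combo)
```

In this toy setup the keyword "guitar" matches the term Guitar exactly, yet the rewriting (Guitarist, City) ranks first because those two terms co-occur in a fact. That is the kind of trade-off between syntactic match and semantic coherence the abstract describes.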

Bibliographic details

Source
ACM SIGMOD International Conference on Management of Data | 2010 | 12 pages
Original format: PDF
Chinese Library Classification: Programming, software engineering
Keywords
design; experimentation; languages; performance;

Similar literature

1. Expressive Languages for Path Queries over Graph-Structured Data [J]. Pablo Barcelo, Leonid Libkin, Anthony W. Lin. ACM Transactions on Database Systems, 2012, No. 4
2. Highly Expressive Query Languages for Unordered Data Trees [J]. Abiteboul Serge, Bourhis Pierre, Vianu Victor. Theory of Computing Systems, 2015, No. 4
3. On the expressiveness of linear-constraint query languages for spatial databases [J]. Vandeurzen L., Van Gucht D., Gyssens M. Theoretical Computer Science, 2001, No. 1-2
4. Expressive and Flexible Access to Web-Extracted Data: A Keyword-based Structured Query Language [C]. Jeffrey Pound, Ihab F. Ilyas, Grant Weddell. ACM SIGMOD International Conference on Management of Data (SIGMOD 2010), 2010
5. Flexible query facilities for heterogeneous semi-structured data [D]. Li, Yunyao. 2007
6. Medical Query Language: Improved Access to MUMPS Databases [O]. Sally Webster, Mary Morgan, G. Octo Barnett. 1987
7. Expressive languages for path queries over graph-structured data [O]. Barcelo, P., Libkin, L., Lin, A.W. 2012
