SIGMOD/PODS 2007
Indexing Dataspaces

Abstract

Dataspaces are collections of heterogeneous and partially unstructured data. Unlike data-integration systems that also offer uniform access to heterogeneous data sources, dataspaces do not assume that all the semantic relationships between sources are known and specified. Much of the user interaction with dataspaces involves exploring the data, and users do not have a single schema to which they can pose queries. Consequently, it is important that queries are allowed to specify varying degrees of structure, spanning keyword queries to more structure-aware queries.

This paper considers indexing support for queries that combine keywords and structure. We describe several extensions to inverted lists to capture structure when it is present. In particular, our extensions incorporate attribute labels, relationships between data items, hierarchies of schema elements, and synonyms among schema elements. We describe experiments showing that our indexing techniques improve query efficiency by an order of magnitude compared with alternative approaches, and scale well with the size of the data.
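As a rough illustration of what "extending an inverted list with attribute labels" could look like, the sketch below indexes each token twice: once as a plain keyword and once combined with the attribute it occurs in. This is a minimal sketch under stated assumptions, not the paper's implementation: the `token//attribute//` key format, the function names `build_index` and `lookup`, and the sample data are all hypothetical, and the other extensions the abstract mentions (relationships, hierarchies, synonyms) are not covered.

```python
from collections import defaultdict


def build_index(items):
    """Build a toy inverted list from {item_id: {attribute: text}}.

    Assumption: each token gets a plain posting and an
    attribute-qualified posting keyed as "token//attribute//".
    """
    index = defaultdict(set)
    for item_id, attrs in items.items():
        for attr, value in attrs.items():
            for token in value.lower().split():
                index[token].add(item_id)  # keyword-only posting
                index[f"{token}//{attr.lower()}//"].add(item_id)  # keyword within attribute
    return index


def lookup(index, keyword, attribute=None):
    """Return sorted item ids matching a keyword, optionally restricted to an attribute."""
    key = keyword.lower()
    if attribute is not None:
        key = f"{key}//{attribute.lower()}//"
    return sorted(index.get(key, set()))


# Hypothetical sample data: one person item and one article item.
items = {
    "p1": {"name": "Ada Lovelace", "affiliation": "Analytical Society"},
    "a1": {"title": "Notes on the Analytical Engine", "author": "Ada Lovelace"},
}
index = build_index(items)

print(lookup(index, "ada"))            # pure keyword query
print(lookup(index, "ada", "author"))  # keyword restricted to the "author" attribute
```

With a single index, the same lookup path serves both an unstructured keyword query and a query that names an attribute, which is the kind of "varying degrees of structure" the abstract refers to.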
