LCA-based Selection for XML Document Collections

Abstract

In this paper, we address the problem of database selection for XML document collections, that is, given a set of collections and a user query, how to rank the collections based on their goodness with respect to the query. Goodness is determined by the relevance of the documents in the collection to the query. We consider keyword queries and support Lowest Common Ancestor (LCA) semantics for defining query results, where the relevance of each document to a query is determined by properties of the LCA of those nodes in the XML document that contain the query keywords. To avoid evaluating queries against each document in a collection, we propose maintaining, in a preprocessing phase, information about the LCAs of all pairs of keywords in a document and using it to approximate the properties of the LCA-based results of a query. To improve storage and processing efficiency, we use appropriate summaries of the LCA information based on Bloom filters. We address both a boolean and a weighted version of the database selection problem. Our experimental results show that our approach incurs low errors in the estimation of the goodness of a collection and provides rankings that are very close to the actual ones.
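
The idea described in the abstract (precompute LCA information for keyword pairs, summarize it in a Bloom filter, then estimate goodness without evaluating the query on each document) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The tree encoding, the `min_depth` rule that decides when a keyword pair is recorded, the scoring by document count, and all function names are assumptions made for illustration only, and the sketch covers only a boolean-style estimate.

```python
import hashlib
from itertools import combinations


class BloomFilter:
    """Small Bloom filter backed by a Python int used as a bit array."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, item):
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        # May return a false positive, never a false negative.
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


# Hypothetical document encoding: node id -> (parent id or None, keywords).
def ancestors(doc, node):
    """Path from node up to the root, starting with node itself."""
    path = []
    while node is not None:
        path.append(node)
        node = doc[node][0]
    return path


def lca_depth(doc, u, v):
    """Depth (root = 0) of the lowest common ancestor of nodes u and v."""
    anc_u = set(ancestors(doc, u))
    for w in ancestors(doc, v):
        if w in anc_u:
            return len(ancestors(doc, w)) - 1
    raise ValueError("nodes are not in the same tree")


def summarize_document(doc, min_depth=1, num_bits=1024):
    """Preprocessing: record keyword pairs whose deepest LCA is deep enough.

    Assumption: a pair is recorded when its deepest LCA has depth >= min_depth.
    """
    nodes_by_kw = {}
    for node, (_, kws) in doc.items():
        for kw in kws:
            nodes_by_kw.setdefault(kw, []).append(node)
    bf = BloomFilter(num_bits)
    for a, b in combinations(sorted(nodes_by_kw), 2):
        best = max(lca_depth(doc, u, v)
                   for u in nodes_by_kw[a] for v in nodes_by_kw[b])
        if best >= min_depth:
            bf.add(f"{a}|{b}")
    return bf


def may_match(bf, query):
    """True if every keyword pair of the query is (probably) in the summary.

    Assumes the query has at least two distinct keywords.
    """
    kws = sorted(set(query))
    return all(f"{a}|{b}" in bf for a, b in combinations(kws, 2))


def estimate_goodness(summaries, query):
    """Estimated number of matching documents in one collection."""
    return sum(may_match(bf, query) for bf in summaries)


def rank_collections(collections, query):
    """Collection names ordered by estimated goodness, best first."""
    scores = {name: estimate_goodness(s, query) for name, s in collections.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Two toy documents: in doc1 "xml" and "bloom" share a non-root ancestor,
# in doc2 they meet only at the root.
doc1 = {0: (None, set()), 1: (0, set()), 2: (1, {"xml"}),
        3: (1, {"bloom"}), 4: (0, {"ranking"})}
doc2 = {0: (None, set()), 1: (0, {"xml"}), 2: (0, {"bloom"})}
bf1, bf2 = summarize_document(doc1), summarize_document(doc2)
ranking = rank_collections({"A": [bf1], "B": [bf2]}, ["xml", "bloom"])
```

Because a Bloom filter can report false positives but never false negatives, an estimate of this kind can only overcount matching documents, which is consistent with the abstract's framing of the result as an approximation with measurable error.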

Bibliographic details

Source: 19th International World Wide Web Conference 2010 | 2010 | pp. 511-520 | 10 pages
Authors: Georgia Koloniari; Evaggelia Pitoura
Original format: PDF
CLC classification: Computer networks
Keywords: algorithms; experimentation; performance
