Journal: Information Systems

Comparing top-k XML lists

Abstract

Systems that produce ranked lists of results are abundant. For instance, Web search engines return ranked lists of Web pages. There has been work on distance measures for list permutations, such as Kendall tau and Spearman's footrule, as well as extensions to handle top-k lists, which are more common in practice. Beyond systems that rank whole objects (e.g., Web pages), an increasing number of systems provide keyword search on XML or other semistructured data and produce ranked lists of XML sub-trees. Unfortunately, previous distance measures are not suitable for ranked lists of sub-trees since they do not account for the possible overlap between the returned sub-trees. That is, two sub-trees differing by a single node would be considered separate objects. In this paper, we present the first distance measures for ranked lists of sub-trees, and show under what conditions these measures are metrics. Furthermore, we present algorithms to efficiently compute these distance measures. Finally, we evaluate and compare the proposed measures on real data using three popular XML keyword proximity search systems.
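
To make the limitation described in the abstract concrete, the following is a minimal Python sketch of the classical measures it names, not the measures proposed in the paper. It computes Spearman's footrule and Kendall tau on permutations, plus one common top-k footrule variant in which an item missing from a list is placed at rank k+1. The function names, the rank-(k+1) convention, and the modelling of a sub-tree as a `frozenset` of node IDs are all assumptions made for illustration.

```python
from itertools import combinations


def spearman_footrule(a, b):
    """Sum of absolute position differences between two permutations of the same items."""
    pos_b = {item: i for i, item in enumerate(b)}
    return sum(abs(i - pos_b[item]) for i, item in enumerate(a))


def kendall_tau(a, b):
    """Number of item pairs that the two permutations order differently."""
    pos_b = {item: i for i, item in enumerate(b)}
    return sum(1 for x, y in combinations(a, 2) if pos_b[x] > pos_b[y])


def topk_footrule(a, b):
    """Footrule for two top-k lists; an item absent from a list is assumed to sit at rank k+1."""
    k = len(a)  # assumption: both lists have the same length k
    rank_a = {item: i + 1 for i, item in enumerate(a)}
    rank_b = {item: i + 1 for i, item in enumerate(b)}
    items = set(a) | set(b)
    return sum(abs(rank_a.get(x, k + 1) - rank_b.get(x, k + 1)) for x in items)


# Hypothetical sub-trees, each modelled as a set of node IDs.
t1 = frozenset({1, 2, 3})
t1_plus = frozenset({1, 2, 3, 4})   # differs from t1 by a single node
t2 = frozenset({7, 8})
unrelated = frozenset({50, 51})     # shares no node with t1

near_miss = topk_footrule([t1, t2], [t1_plus, t2])
disjoint = topk_footrule([t1, t2], [unrelated, t2])
print(near_miss, disjoint)
```

Under these assumptions, the list whose top result differs by one node is scored exactly as far away as the list whose top result is entirely unrelated, because the measure only tests sub-trees for equality. That is the overlap problem the abstract points to.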