首页> 外文会议>International conference on web-age information management >Top-Down SLCA Computation Based on Hash Search
【24h】

Top-Down SLCA Computation Based on Hash Search

机译:基于哈希搜索的自顶向下SLCA计算

获取原文

摘要

In this paper, we focus on efficient processing of a given XML keyword query based on SLCA semantics. We assign to each node an ID that equals to its visiting order when traversing the XML document in deep-first order, based on which we construct two kinds of indexes. The first index is an inverted list L of IDDewey labels for each keyword k, where each IDDewey label l ∈ L represents a node v that directly contains k, l consists of node IDs corresponding to all nodes on the path from the document root to v. The second index is a hash table, which records, for each pair of node v and keyword k, the number of occurrence of k in the subtree rooted at v. Based on the two indexes, we propose an algorithm, namely TDHS, that takes the shortest inverted IDDewey label list as the working list and computes all SLCA results in a top-down manner based on hash search. Compared with existing methods, our method achieves the worst case time complexity of O(m · |L_1~(ID)|) for a given keyword query Q, where |L_1~(ID)| is the number of distinct node IDs in the shortest inverted IDDewey label list of Q. Our experimental results verify the performance advantages of our method according to various evaluation metrics.
机译:在本文中,我们专注于基于SLCA语义的给定XML关键字查询的有效处理。当以深度优先顺序遍历XML文档时,我们为每个节点分配一个等于其访问顺序的ID,并以此为基础构造两种索引。第一个索引是每个关键字k的IDDewey标签的反向列表L,其中每个IDDewey标签l∈L代表直接包含k的节点v,l由对应于从文档根到v路径上所有节点的节点ID组成第二个索引是一个哈希表,它针对节点v和关键字k的每对记录以v为根的子树中k的出现次数。基于这两个索引,我们提出了一种算法,即TDHS将最短的反向IDDewey标签列表作为工作列表,并基于哈希搜索以自上而下的方式计算所有SLCA结果。与现有方法相比,对于给定的关键字查询Q,我们的方法实现了O(m·| L_1〜(ID)|)的最坏情况下的时间复杂度,其中| L_1〜(ID)|是Q的最短反向IDDewey标签列表中不同节点ID的数量。我们的实验结果根据各种评估指标证明了我们方法的性能优势。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号