International Conference on Extending Database Technology (EDBT 2006); March 26-31, 2006; Munich, Germany

DPTree: A Distributed Pattern Tree Index for Partial-Match Queries in Peer-to-Peer Networks

Abstract

Partial-match queries return data items that contain a subset of the query keywords and order the results based on the statistical properties of the matched keywords. They are essential for information retrieval on large document repositories. However, most current peer-to-peer networks for information retrieval are based on distributed hashing and as such cannot support partial-match queries efficiently. In this paper, we describe an efficient and scalable technique to support partial-match queries on peer-to-peer networks. We observe that the combinations of keywords in the queries are only a small subset of all possible combinations of the keywords in the documents. Therefore, we propose a distributed index structure, called a distributed pattern tree (DPTree), to record frequent query patterns, i.e., combinations of keywords, learnt from the query history at each node in the network. Using this index, a query can identify its best matching patterns quickly and data lookup can be done in logarithmic time with respect to the network size. Our simulation studies on the TREC data sets have shown promising results in comparison with other previous approaches.
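
The idea of recording frequent query patterns and matching a new query against them can be illustrated with a small sketch. This is not the paper's DPTree: it is a single-process toy with no tree structure, no peer-to-peer distribution, and no logarithmic lookup, and every name in it (`PatternIndex`, `record_query`, `best_matching_patterns`, `min_support`, `max_pattern_size`) is a hypothetical choice made for illustration. It only shows, under those assumptions, how keyword combinations seen often in a query history might be used to find the best matching patterns for a new query.

```python
from collections import Counter
from itertools import combinations


class PatternIndex:
    """Toy, centralized stand-in for a query-pattern index (not the paper's DPTree)."""

    def __init__(self, min_support=2, max_pattern_size=3):
        # Assumed parameters: a pattern is "frequent" if seen in at least
        # min_support past queries; patterns larger than max_pattern_size
        # keywords are not tracked.
        self.min_support = min_support
        self.max_pattern_size = max_pattern_size
        self.pattern_counts = Counter()  # frozenset of keywords -> query count

    def record_query(self, keywords):
        """Count every keyword combination (up to max_pattern_size) in a past query."""
        unique = sorted(set(keywords))
        largest = min(len(unique), self.max_pattern_size)
        for size in range(1, largest + 1):
            for combo in combinations(unique, size):
                self.pattern_counts[frozenset(combo)] += 1

    def frequent_patterns(self):
        """Return the set of patterns that meet the support threshold."""
        return {p for p, n in self.pattern_counts.items() if n >= self.min_support}

    def best_matching_patterns(self, keywords):
        """Return frequent patterns contained in the query, largest first."""
        query = set(keywords)
        matches = [p for p in self.frequent_patterns() if p <= query]
        # Larger patterns cover more query keywords; ties break alphabetically.
        return sorted(matches, key=lambda p: (-len(p), sorted(p)))


if __name__ == "__main__":
    index = PatternIndex(min_support=2)
    for past_query in (["peer", "network"],
                       ["peer", "network", "index"],
                       ["index", "tree"],
                       ["peer", "network", "query"]):
        index.record_query(past_query)
    for pattern in index.best_matching_patterns(["peer", "network", "routing"]):
        print(sorted(pattern))
```

In this sketch the query keyword `routing` never appears in the history, so the query is matched only through the frequent combinations of its other keywords, which is the partial-match behaviour the abstract describes at a high level.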
