Journal: IEEE Transactions on Knowledge and Data Engineering

Efficient Top-k Retrieval on Massive Data



Abstract

In many applications, the top-k query is an important operation that returns a set of interesting points from a potentially huge data space. This paper shows that existing algorithms cannot process top-k queries on massive data efficiently, and proposes a novel table-scan-based algorithm, T2S, to compute top-k results efficiently on massive data. T2S first constructs a presorted table whose tuples are arranged in the order in which round-robin retrieval on the sorted lists would reach them. T2S keeps only a fixed number of tuples in memory to compute the results. The paper presents the early-termination check for T2S, along with an analysis of its scan depth. Selective retrieval is devised to skip tuples in the presorted table that cannot be top-k results, and the theoretical analysis proves that it significantly reduces the number of tuples retrieved. Methods for constructing the structures used, and for updating them incrementally or in batches, are also proposed. Extensive experiments on synthetic and real-life data sets show that T2S has a significant advantage over the existing algorithms.
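The presorted-table idea in the abstract can be illustrated with a small sketch. This is not the authors' T2S implementation: it assumes the scoring function is the plain sum of attribute values (larger is better), borrows a threshold-style stopping rule of the kind used by sorted-list top-k algorithms, and omits selective retrieval entirely. The names `build_presorted_table` and `top_k_scan` are hypothetical.

```python
import heapq


def build_presorted_table(rows):
    """Order tuple ids by when round-robin retrieval on the per-attribute
    sorted lists (descending) would first reach them."""
    n, m = len(rows), len(rows[0])
    sorted_lists = [
        sorted(range(n), key=lambda i, j=j: -rows[i][j]) for j in range(m)
    ]
    seen = set()
    table = []  # entries are (tuple_id, depth at which the tuple is first seen)
    for depth in range(n):
        for j in range(m):
            tid = sorted_lists[j][depth]
            if tid not in seen:
                seen.add(tid)
                table.append((tid, depth))
    return table, sorted_lists


def top_k_scan(rows, k):
    """Sequentially scan the presorted table, keeping at most k candidates.

    Assumption: score = sum of attributes. A tuple first seen at depth d has
    every attribute bounded by the value at depth d - 1 of the matching
    sorted list, so the sum of those values bounds all remaining scores.
    """
    if k <= 0 or not rows:
        return [], 0
    table, sorted_lists = build_presorted_table(rows)
    m = len(rows[0])
    heap = []  # min-heap of (score, tuple_id), never larger than k
    scanned = 0
    for tid, depth in table:
        if len(heap) == k and depth > 0:
            bound = sum(rows[sorted_lists[j][depth - 1]][j] for j in range(m))
            if heap[0][0] >= bound:
                break  # no remaining tuple can beat the current k-th score
        scanned += 1
        score = sum(rows[tid])
        if len(heap) < k:
            heapq.heappush(heap, (score, tid))
        elif score > heap[0][0]:
            heapq.heapreplace(heap, (score, tid))
    return sorted(heap, reverse=True), scanned
```

Calling `top_k_scan(rows, k)` on a list of equal-length numeric tuples returns the k best `(score, tuple_id)` pairs together with the number of table entries actually scanned, which makes the effect of early termination visible. In a real system the presorted table would be built once and stored on disk so that a query becomes a sequential scan; here it is rebuilt in memory on every call for brevity.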
