Conference: European Conference on IR Research

Compressing Inverted Indexes with Recursive Graph Bisection: A Reproducibility Study

Abstract

Document reordering is an important but often overlooked preprocessing stage in index construction. Reordering document identifiers in graphs and inverted indexes has been shown to reduce storage costs and improve processing efficiency in the resulting indexes. However, surprisingly few document reordering algorithms are publicly available despite their importance. A new reordering algorithm derived from recursive graph bisection was recently proposed by Dhulipala et al., and shown to be highly effective and efficient when compared against other state-of-the-art reordering strategies. In this work, we present a reproducibility study of this new algorithm. We describe the implementation challenges encountered, and explore the performance characteristics of our clean-room reimplementation. We show that we are able to successfully reproduce the core results of the original paper, and show that the algorithm generalizes to other collections and indexing frameworks. Furthermore, we make our implementation publicly available to help promote further research in this space.
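To make the storage claim concrete, the toy sketch below shows why the order of document identifiers matters for a gap-encoded inverted index. It is not the recursive graph bisection algorithm itself; the two-term index, the helper names (`gaps`, `log_gap_bits`, `remap`), and the use of a summed log2-gap measure as a stand-in for compressed size are all assumptions made for illustration.

```python
import math

def gaps(postings):
    """Turn a sorted list of document IDs into the first ID followed by deltas."""
    return [postings[0]] + [b - a for a, b in zip(postings, postings[1:])]

def log_gap_bits(index):
    """Sum of log2(gap) over all postings: a rough proxy for compressed size."""
    return sum(math.log2(g) for plist in index.values() for g in gaps(plist))

def remap(index, order):
    """Renumber documents: order[i] is the old ID that receives new ID i + 1."""
    new_id = {old: new for new, old in enumerate(order, start=1)}
    return {t: sorted(new_id[d] for d in plist) for t, plist in index.items()}

# Hypothetical toy index: two groups of similar documents interleaved in ID space.
index = {
    "graph": [1, 3, 5, 7],
    "index": [2, 4, 6, 8],
}

# Place documents that share terms next to each other.
clustered = remap(index, [1, 3, 5, 7, 2, 4, 6, 8])

print(log_gap_bits(index))      # cost under the original ordering
print(log_gap_bits(clustered))  # cost after clustering similar documents
```

Under this measure the clustered ordering is cheaper, because documents that share a term end up with consecutive identifiers and their gaps shrink to 1. A reordering algorithm searches for such an assignment automatically on a real collection.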
