【24h】

Learned Sorted Table Search and Static Indexes in Small Model Space

机译:在小模型空间中学习排序表搜索和静态索引

获取原文
获取原文并翻译 | 示例

摘要

Machine Learning Techniques, properly combined with Data Structures, have resulted in Learned Static Indexes, innovative and powerful tools that speed-up Binary Search, with the use of additional space with respect to the table being searched into. Such space is devoted to the ML model. Although in their infancy, they are methodologically and practically important, due to the pervasiveness of Sorted Table Search procedures. In modern applications, model space is a key factor and, in fact, a major open question concerning this area is to assess to what extent one can enjoy the speed-up of Learned Indexes while using constant or nearly constant space models. We address it here by (a) introducing two new models, i.e., denoted KO-BFS and SY-RMI, respectively; (b) by systematically exploring, for the first time, the time-space tradeoffs of a hierarchy of existing models, i.e., the ones in SOSD, together with the new ones. We document a novel and rather complex time-space trade-off picture, which is very informative for users. We experimentally show that the KO-BFS can speed-up Interpolation Search and Uniform Binary Search in constant space. For other versions of Binary Search, our second model, together with the bi-criteria PGM index, can achieve a speed-up with a model space of 0.05% more than the one taken by the table, being competitive in terms of time-space trade-off with existing proposals. The SY-RMI and the bi-criteria PGM complement each other quite well across the various levels of the internal memory hierarchy. Finally, our findings are of interest to designers, since they highlight the need of further studies regarding the time-space relation in Learned Indexes.
机译:机器学习技术与数据结构适当结合,产生了学习静态索引,这是一种创新且强大的工具,可以加快二进制搜索的速度,并在搜索到的表中使用额外的空间。此类空间专门用于 ML 模型。尽管它们处于起步阶段,但由于 Sorted Table Search 程序的普遍性,它们在方法论和实践上都很重要。在现代应用中,模型空间是一个关键因素,事实上,关于该领域的一个主要悬而未决的问题是评估在使用恒定或几乎恒定空间模型时,可以在多大程度上享受学习索引的加速。我们在这里通过以下方式解决这个问题:(a) 介绍两个新模型,即分别表示 KO-BFS 和 SY-RMI;(b) 首次系统地探索了现有模型(即 SOSD 中的模型)与新模型层次结构的时空权衡。我们记录了一张新颖且相当复杂的时空权衡图,这对用户来说非常有用。我们实验表明,KO-BFS 可以在恒定空间中加速插值搜索和均匀二进制搜索。对于其他版本的二叉搜索,我们的第二个模型,连同双标准 PGM 指数,可以实现加速,模型空间比表所采用的模型多 0.05%,在时空权衡方面与现有提案具有竞争力。SY-RMI 和双标准 PGM 在内部存储器层次结构的各个级别上相辅相成。最后,我们的发现引起了设计师的兴趣,因为它们强调了对 Learned Indexes 中的时空关系进行进一步研究的必要性。

著录项

获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号