An Evaluation of Generic Bulk Loading Techniques

Abstract

Bulk loading refers to the process of creating an index from scratch for a given data set. This problem is well understood for B-trees, but non-traditional index structures have so far received only modest attention. We are particularly interested in fast generic bulk loading techniques whose implementations employ only a small interface that is satisfied by a broad class of index structures. Generic techniques are very attractive to extensible database systems, since different user-implemented index structures implementing that small interface can be bulk-loaded without any modification of the generic code. The main contribution of the paper is the proposal of two new generic and conceptually simple bulk loading algorithms. These algorithms recursively partition the input by using a main-memory index of the same type as the target index to be built. In contrast to previous generic bulk loading algorithms, our new algorithms turn out to be much easier to implement. Another advantage is that they have fewer parameters whose settings have to be taken into consideration. We present an experimental performance comparison in which different bulk loading algorithms are investigated in a system-like scenario. Our experiments are unique in the sense that we examine the same code for different index structures (R-tree and Slim-tree). The results consistently indicate that our new algorithms outperform asymptotically worst-case optimal competitors. Moreover, the search quality of the target index is better when our new bulk loading algorithms are used.
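
The recursive partitioning idea can be illustrated with a minimal Python sketch. It is not the paper's algorithm and makes several assumptions for illustration only: the three-method interface (`insert`, `route`, `leaves`), the toy one-dimensional `ToyIndex`, and the `memory_capacity` parameter are all hypothetical names, and the buckets that a real system would spill to disk are plain lists here. The sketch also assumes `memory_capacity` is larger than the leaf capacity, so that the in-memory index always has at least two leaves and every bucket is strictly smaller than its input.

```python
from bisect import bisect_right
from typing import Callable, List, Protocol, Sequence


class Index(Protocol):
    """Hypothetical small interface; the method names are assumptions."""

    def insert(self, record: int) -> None: ...
    def route(self, record: int) -> int: ...
    def leaves(self) -> List[List[int]]: ...


class ToyIndex:
    """Toy 1-D index whose sorted leaves split at the median on overflow."""

    def __init__(self, leaf_capacity: int = 4) -> None:
        self.leaf_capacity = leaf_capacity
        self._leaves: List[List[int]] = [[]]
        self._bounds: List[int] = []  # _bounds[i] is the smallest key of leaf i + 1

    def route(self, record: int) -> int:
        # Number of the leaf an insertion of `record` would descend to.
        return bisect_right(self._bounds, record)

    def insert(self, record: int) -> None:
        i = self.route(record)
        leaf = self._leaves[i]
        leaf.insert(bisect_right(leaf, record), record)
        if len(leaf) > self.leaf_capacity:
            mid = len(leaf) // 2
            self._leaves[i:i + 1] = [leaf[:mid], leaf[mid:]]
            self._bounds.insert(i, leaf[mid])

    def leaves(self) -> List[List[int]]:
        return self._leaves


def bulk_load(records: Sequence[int], make_index: Callable[[], Index],
              memory_capacity: int) -> List[List[int]]:
    """Return the leaf level of the target index built for `records`.

    Assumes memory_capacity exceeds the index's leaf capacity.
    """
    index = make_index()
    for record in records[:memory_capacity]:
        index.insert(record)
    if len(records) <= memory_capacity:
        return index.leaves()  # the input fits in memory: build it directly

    # The leaves of the in-memory index define the partitions (buckets).
    buckets = [list(leaf) for leaf in index.leaves()]
    for record in records[memory_capacity:]:
        buckets[index.route(record)].append(record)

    # Each bucket is loaded by the same generic code, recursively.
    result: List[List[int]] = []
    for bucket in buckets:
        result.extend(bulk_load(bucket, make_index, memory_capacity))
    return result


# Usage: load 60 keys with room for only 16 records in "memory".
data = [(i * 37) % 101 for i in range(60)]
leaves = bulk_load(data, lambda: ToyIndex(leaf_capacity=4), memory_capacity=16)
```

Because `bulk_load` only touches the three interface methods, any other index class providing them could be passed through `make_index` without changing the loader, which is the property the abstract describes as attractive for extensible database systems.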
