首页> 外文期刊>Journal of the Brazilian Computer Society >DBM-Tree: Trading height-balancing for performance in metric access methods
【24h】

DBM-Tree: Trading height-balancing for performance in metric access methods

机译:DBM-Tree:权衡高度平衡以提高度量标准访问方法的性能

获取原文
获取原文并翻译 | 示例
           

摘要

Metric Access Methods (MAM) are employed to accelerate the processing of similarity queries, such as the range and the k-nearest neighbor queries. Current methods, such as the Slim-tree and the M-tree, improve the query performance minimizing the number of disk accesses, keeping a constant height of the structures stored on disks (height-balanced trees). However, the overlapping between their nodes has a very high influence on their performance. This paper presents a new dynamic MAM called theDBM-tree (Density-Based Metric tree), which can minimize the overlap between high-density nodes by relaxing the height-balancing of the structure. Thus, the height of the tree is larger in denser regions, in order to keep a tradeoff between breadth-searching and depth-searching. An underpinning for cost estimation on tree structures is their height, so we show a non-height dependable cost model that can be applied for DBM-tree. Moreover, an optimization algorithm calledShrink is also presented, which improves the performance of an already builtDBM-tree by reorganizing the elements among their nodes. Experiments performed over both synthetic and real world datasets showed that theDBM-tree is, in average, 50% faster than traditional MAM and reduces the number of distance calculations by up to 72% and disk accesses by up to 66%. After performing the Shrink algorithm, the performance improves up to 40% regarding the number of disk accesses for range andk-nearest neighbor queries. In addition, theDBM-tree scales up well, exhibiting linear performance with growing number of elements in the database.
机译:度量访问方法(MAM)用于加速相似性查询(例如范围和k最近邻查询)的处理。当前的方法(例如Slim树和M树)提高了查询性能,最大程度地减少了磁盘访问次数,并使存储在磁盘上的结构(高度平衡的树)保持恒定的高度。但是,节点之间的重叠对它们的性能有很大的影响。本文提出了一种新的动态MAM,称为DBM树(基于密度的度量树),它可以通过放宽结构的高度平衡来最大程度地减少高密度节点之间的重叠。因此,为了在宽度搜索和深度搜索之间保持折衷,树的高度在较密集的区域中更大。树结构成本估算的基础是它们的高度,因此我们展示了可用于DBM树的非高度可靠的成本模型。此外,还提出了一种称为收缩的优化算法,该算法通过在节点之间重组元素来提高已构建的DBM树的性能。对合成数据集和现实世界数据集进行的实验表明,DBM树平均比传统MAM快50%,并且距离计算次数最多减少72%,磁盘访问次数最多减少66%。执行Shrink算法后,对于范围和k最近邻居查询的磁盘访问次数,性能提高了40%。此外,DBM树的伸缩性很好,随着数据库中元素数量的增加,其线性性能也随之提高。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号