Conference: IEEE International Congress on Big Data

A comparative study of dual-tree algorithm implementations for computing 2-body statistics in spatial data

Abstract

The 2-body correlation function (2-BCF) is a group of statistical measurements that have found applications in many scientific domains. One type of 2-BCF, the Spatial Distance Histogram (SDH), is of vital importance in describing the physical features of natural systems. While a naïve way of computing the SDH requires quadratic time, efficient algorithms based on resolving nodes in spatial trees have been developed. A key decision in the design of such algorithms is the choice of the underlying data structure: our previous work uses a quad-tree (an oct-tree for 3-dimensional data), and in this paper we propose a kd-tree-based solution. Although it is easy to see that both implementations have the same time complexity $O(N^{(2d-1)/d})$, where $d$ is the number of dimensions of the dataset, we conduct a thorough comparison of their actual running times under different scenarios. In particular, we present an analytical model to rigorously quantify the running time of dual-tree algorithms. Our analysis suggests that the kd-tree-based implementation outperforms the quad-/oct-tree solution under all scenarios with different data sizes and query parameters. Specifically, this advantage amounts to a speedup of up to 1.23X over the quad-tree algorithm for 2D data. Results of extensive experiments on synthetic and real datasets confirm our findings.
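To make the contrast between the quadratic baseline and node resolution concrete, the following is a minimal illustrative sketch. It is not the authors' implementation. It assumes equal-width buckets, a median-split kd-tree whose nodes store their points and bounding boxes, and a simple rule that a pair of nodes is "resolved" when the minimum and maximum distances between their bounding boxes fall into the same bucket. All names (`naive_sdh`, `KDNode`, `dual_tree_sdh`, `leaf_size`) are hypothetical.

```python
import math
import random
from itertools import combinations


def bucket_of(dist, width, n_buckets):
    """Bucket index for a distance; overflow is clamped to the last bucket."""
    return min(int(dist / width), n_buckets - 1)


def naive_sdh(points, width, n_buckets):
    """Quadratic-time baseline: bin every pairwise distance directly."""
    hist = [0] * n_buckets
    for p, q in combinations(points, 2):
        hist[bucket_of(math.dist(p, q), width, n_buckets)] += 1
    return hist


class KDNode:
    """kd-tree node that keeps its points and their bounding box (assumed layout)."""

    def __init__(self, pts, leaf_size, depth=0):
        dims = len(pts[0])
        self.pts = pts
        self.lo = [min(p[k] for p in pts) for k in range(dims)]
        self.hi = [max(p[k] for p in pts) for k in range(dims)]
        self.kids = []
        if len(pts) > leaf_size:
            # Median split along the axis chosen by cycling through dimensions.
            ordered = sorted(pts, key=lambda p: p[depth % dims])
            mid = len(ordered) // 2
            self.kids = [KDNode(ordered[:mid], leaf_size, depth + 1),
                         KDNode(ordered[mid:], leaf_size, depth + 1)]


def box_distance_range(a, b):
    """Smallest and largest possible distance between points of two boxes."""
    lo2 = hi2 = 0.0
    for alo, ahi, blo, bhi in zip(a.lo, a.hi, b.lo, b.hi):
        gap = max(alo - bhi, blo - ahi, 0.0)   # 0 when the intervals overlap
        span = max(ahi - blo, bhi - alo)       # farthest separation on this axis
        lo2 += gap * gap
        hi2 += span * span
    return math.sqrt(lo2), math.sqrt(hi2)


def dual_tree_sdh(points, width, n_buckets, leaf_size=8):
    """Dual-tree SDH sketch: resolve node pairs in bulk when possible."""
    hist = [0] * n_buckets
    root = KDNode(list(points), leaf_size)

    def visit(a, b):
        if a is b:  # pairs inside one node
            if a.kids:
                left, right = a.kids
                visit(left, left)
                visit(right, right)
                visit(left, right)
            else:
                for p, q in combinations(a.pts, 2):
                    hist[bucket_of(math.dist(p, q), width, n_buckets)] += 1
            return
        lo, hi = box_distance_range(a, b)
        b_lo = bucket_of(lo, width, n_buckets)
        if b_lo == bucket_of(hi, width, n_buckets):
            # Resolved: every cross pair lands in the same bucket.
            hist[b_lo] += len(a.pts) * len(b.pts)
        elif not a.kids and not b.kids:
            for p in a.pts:
                for q in b.pts:
                    hist[bucket_of(math.dist(p, q), width, n_buckets)] += 1
        elif a.kids and (not b.kids or len(a.pts) >= len(b.pts)):
            for kid in a.kids:  # split the larger node that can still be split
                visit(kid, b)
        else:
            for kid in b.kids:
                visit(a, kid)

    visit(root, root)
    return hist


if __name__ == "__main__":
    random.seed(0)
    data = [(random.random(), random.random()) for _ in range(2000)]
    print(dual_tree_sdh(data, width=0.1, n_buckets=15))
```

Both functions return the same histogram; the dual-tree version only skips per-pair distance computations for node pairs that resolve. Replacing `KDNode` with a quad-tree node that exposes the same `pts`, `lo`, `hi`, and `kids` attributes would leave `dual_tree_sdh` unchanged, which is the kind of data-structure swap the abstract compares.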
