Extending High-Dimensional Indexing Techniques Pyramid and iMinMax(θ): Lessons Learned

Abstract

The Pyramid Technique and iMinMax(θ) are two popular high-dimensional indexing approaches that map points in a high-dimensional space to a single-dimensional index. In this work, we perform the first independent experimental evaluation of the Pyramid Technique and iMinMax(θ), and discuss in detail promising extensions, which we test with k-Nearest Neighbor (kNN) and range queries. For datasets with skewed distributions, the parameters of these algorithms must be tuned to maintain balanced partitions. We show that these parameters can be optimized using the medians of the distribution. For the Pyramid Technique, different approximate median methods for data space partitioning are experimentally compared using kNN queries. For iMinMax(θ), the default parameter setting and parameters tuned using the distribution median are experimentally compared using range queries. Also, as proposed in the iMinMax(θ) paper, we investigate the benefit of maintaining a separate parameter for each dimension to account for its skewness, instead of a single parameter over all the dimensions.
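To make the idea of mapping a high-dimensional point to a single-dimensional key concrete, the following is a minimal sketch of the two basic mappings as they are commonly described in the original Pyramid Technique and iMinMax(θ) papers. It assumes points normalized to the unit hypercube [0, 1]^d and does not include the median-based tuning evaluated in this work; the function names `pyramid_value` and `iminmax_value` are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def pyramid_value(point: Sequence[float]) -> float:
    """Basic Pyramid Technique key (assumed form, untuned centre at 0.5)."""
    d = len(point)
    # Dimension in which the point deviates most from the centre 0.5.
    j_max = max(range(d), key=lambda j: abs(0.5 - point[j]))
    height = abs(0.5 - point[j_max])
    # Pyramids 0..d-1 lie below the centre, pyramids d..2d-1 above it.
    pyramid = j_max if point[j_max] < 0.5 else j_max + d
    return pyramid + height


def iminmax_value(point: Sequence[float], theta: float = 0.0) -> float:
    """Basic iMinMax(theta) key (assumed form, one theta for all dimensions)."""
    d = len(point)
    d_min = min(range(d), key=lambda j: point[j])
    d_max = max(range(d), key=lambda j: point[j])
    x_min, x_max = point[d_min], point[d_max]
    # theta shifts the decision between indexing on the min edge or the max edge.
    if x_min + theta < 1.0 - x_max:
        return d_min + x_min
    return d_max + x_max


if __name__ == "__main__":
    p = [0.4, 0.9]
    print(pyramid_value(p))
    print(iminmax_value(p, theta=0.0))
    print(iminmax_value(p, theta=-0.5))
```

In both cases the integer part of the key identifies a partition and the fractional part orders points within it, so the keys can be stored in an ordinary one-dimensional index such as a B+-tree. With skewed data most points fall into a few partitions, which is why the abstract discusses tuning the partitioning parameters.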
