首页> 外文会议>IEEE International Conference on Data Engineering >I-LSH: I/O Efficient c-Approximate Nearest Neighbor Search in High-Dimensional Space
【24h】

I-LSH: I/O Efficient c-Approximate Nearest Neighbor Search in High-Dimensional Space

机译:I-LSH:高维空间中的I / O高效的C - 近似最近邻搜索

获取原文

摘要

Nearest Neighbor search has been well solved in low-dimensional space, but is challenging in high-dimensional space due to the curse of dimensionality. As a trade-off between efficiency and result accuracy, a variety of c-approximate nearest neighbor (c-ANN) algorithms have been proposed to return a c-approximate NN with confident at least δ. We observe that existing c-ANN search algorithms have some limitations on I/O efficiency when their indexes are resided on the external memory, which is critical for handling large scale high-dimensional data. In this paper, we introduce an incremental search based c-ANN search algorithm, named I-LSH. Unlike the previous LSH methods, which expand the bucket width in an exponential way, I-LSH adopts a more natural search strategy to incrementally access the hash values of the objects. We provide rigorous theoretical analysis to underpin our incremental search strategy. Our comprehensive experiment results show that, compared with state-of-the-art I/O efficient c-ANN techniques, our algorithm can achieve much better I/O efficiency under the same theoretical guarantee.
机译:最近的邻居搜索在低维空间中得到很好的解决,但由于维度诅咒,在高维空间中挑战。作为效率和结果准确性之间的权衡,已经提出了各种C近似邻(C-ANN)算法以返回至少δ的自信地返回C近似NN。我们观察到,当它们的索引驻留在外部存储器上时,现有的C-ANN搜索算法对I / O效率有一些限制,这对于处理大规模的高维数据至关重要。在本文中,我们引入了基于增量搜索的C-Ann搜索算法,名为I-LSH。与以前的LSH方法以指数方式扩展桶宽度不同,I-LSH采用更自然的搜索策略来逐步访问对象的哈希值。我们提供严格的理论分析来支撑我们的增量搜索策略。我们的综合实验结果表明,与最先进的I / O高效的C-ANN技术相比,我们的算法可以在相同的理论保证下实现更好的I / O效率。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号