首页> 外文会议>European conference on machine learning and knowledge discovery in databases >Fast kNN Graph Construction with Locality Sensitive Hashing
【24h】

Fast kNN Graph Construction with Locality Sensitive Hashing

机译:具有局部敏感哈希的快速kNN图构建

获取原文

摘要

The k nearest neighbors (kNN) graph, perhaps the most popular graph in machine learning, plays an essential role for graph-based learning methods. Despite its many elegant properties, the brute force kNN graph construction method has computational complexity of O(n~2), which is prohibitive for large scale data sets. In this paper, based on the divide-and-conquer strategy, we propose an efficient algorithm for approximating kNN graphs, which has the time complexity of O(l(d + log n)n) only (d is the dimensionality and l is usually a small number). This is much faster than most existing fast methods. Specifically, we engage the locality sensitive hashing technique to divide items into small subsets with equal size, and then build one kNN graph on each subset using the brute force method. To enhance the approximation quality, we repeat this procedure for several times to generate multiple basic approximate graphs, and combine them to yield a high quality graph. Compared with existing methods, the proposed approach has features that are: (1) much more efficient in speed (2) applicable to generic similarity measures; (3) easy to parallelize. Finally, on three benchmark large-scale data sets, our method beats existing fast methods with obvious advantages.
机译:k最近邻居(kNN)图,可能是机器学习中最流行的图,对于基于图的学​​习方法起着至关重要的作用。尽管蛮力kNN图构造方法具有许多优雅的特性,但其计算复杂度为O(n〜2),这对于大规模数据集是不利的。本文基于分而治之的策略,提出了一种高效的kNN图逼近算法,其时间复杂度仅为O(l(d + log n)n)(d为维,l为通常数量很少)。这比大多数现有的快速方法要快得多。具体来说,我们采用局部敏感的哈希技术将项目分为大小相等的小子集,然后使用蛮力方法在每个子集上构建一个kNN图。为了提高逼近质量,我们重复此过程几次以生成多个基本逼近图,并将其组合以生成高质量图。与现有方法相比,所提出的方法具有以下特征:(1)速度上的效率更高(2)适用于通用相似性度量; (3)易于并行化。最后,在三个基准大规模数据集上,我们的方法以明显的优势击败了现有的快速方法。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号