首页> 外文期刊>Data & Knowledge Engineering >Array-index: a plug&search K nearest neighbors method for high-dimensional data
【24h】

Array-index: a plug&search K nearest neighbors method for high-dimensional data

机译:数组索引:用于高维数据的即插即用K最近邻方法

获取原文
获取原文并翻译 | 示例

摘要

Previous algorithms of data partitioning methods (DPMs) to find the exact K-nearest neighbors (KNN) at high dimensions are outperformed by a linear scan method [J.M. Kleinberg, Two algorithms for nearest neighbor search in high dimensions, 29th ACM Symposium on Theory of computing, 1997; R. Weber, H.-J. Schek, S. Blott. A quantitative analysis and performance study for similarity-search methods in high-dimensional spaces. in: Proc. of the 24th VLDB, USA, 1998]. In this paper, we present a "plug& search" method to greatly speed up the exact KNN search of existing DPMs. The idea is to linearize the data partitions produced by a DPM, rather than the points themselves, into a one-dimensional array-index, that is simple, compact and fast. Unlike most DPMs that support KNN search, which require storage space linear, or exponential [J.M. Kleinberg, Two algorithms for nearest neighbor search in high dimensions, 29th ACM Symposium on Theory of computing, 1997; M. Hagedoom, Nearest neighbors can be found efficiently if the dimension is small relative to the input size, ICDT 2003], in dimensions, the array-index requires a storage space that is linear in the number of mapped partitions.
机译:线性扫描方法优于以前的数据分割方法(DPM)的算法,该算法可在高维处找到确切的K近邻(KNN)。 Kleinberg,《高维中最邻近搜索的两种算法》,第29届ACM计算理论研讨会,1997年; R.Weber,H.-J.舍克,S。布洛特。高维空间中相似搜索方法的定量分析和性能研究。在:Proc。第24届VLDB,美国,1998年]。在本文中,我们提出了一种“即插即用”方法,可以大大加快对现有DPM的精确KNN搜索。这个想法是将DPM产生的数据分区(而不是点本身)线性化为简单,紧凑和快速的一维数组索引。与大多数支持KNN搜索的DPM不同,后者需要线性或指数存储空间[J.M. Kleinberg,《高维中最邻近搜索的两种算法》,第29届ACM计算理论研讨会,1997年; M. Hagedoom,如果尺寸相对于输入尺寸较小,则可以有效地找到最近的邻居(ICTT 2003)。在尺寸上,数组索引需要一个存储空间,该存储空间在映射分区的数量上是线性的。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号