A Recursive Partitioning Method for Nearest Neighbor Search in High Dimensional Data

Raghunadh Pasunuri; Sobha Rani T

Abstract

Many real-world applications require searching large amounts of data to find objects similar to a search query. Nearest neighbour search is a common operation for similarity search in areas as varied as content-based image retrieval, web search engines, and microarray data analysis. With data this large, high dimensionality forces us to look at the data from a different perspective. In this work we propose a recursive partitioning, distance-based indexing scheme for large, high-dimensional data that retrieves the nearest neighbours of a given query. The method divides the data space into disjoint partitions based on the distance from a reference point to the data objects. At the next level, a reference point is selected for each sub-partition, which is again partitioned into further sub-sub-partitions. The main advantage of the method is that it reduces the search space. To find the nearest neighbours we use the Soergel distance, a dissimilarity-based metric that measures the association between two data objects. We are able to retrieve the top 100 nearest neighbours (kNN) by searching only a single bin, the one in which all the nearest neighbours of the given query lie according to their distance from a reference point. We validated the method experimentally on two data sets: the ZINC data set and the AT&T Faces Database. Our results show that the proposed method has much lower construction (off-line) and search (on-line) times than a brute-force linear scan. It prunes the search space from 100 percent to 10 percent or even less, which saves a lot of computation time. We also compared the performance of the proposed method with that of other methods, and the results are presented.
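The scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how reference points are chosen, how distances are binned, or how deep the recursion goes. Here the first object of each partition is assumed to be the reference point, distances are assumed to fall into equal-width bins, and the names `soergel`, `bin_of`, `build`, and `knn` are hypothetical. The Soergel distance is taken in its usual form for non-negative vectors, the sum of |a_i - b_i| divided by the sum of max(a_i, b_i), which lies in [0, 1].

```python
from typing import List, Sequence

Vector = Sequence[float]


def soergel(a: Vector, b: Vector) -> float:
    """Soergel distance for non-negative vectors: sum|a_i - b_i| / sum max(a_i, b_i)."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(max(x, y) for x, y in zip(a, b))
    return num / den if den > 0 else 0.0


def bin_of(d: float, n_bins: int) -> int:
    """Map a distance in [0, 1] to one of n_bins equal-width bins (assumed rule)."""
    return min(int(d * n_bins), n_bins - 1)


def build(points: List[Vector], n_bins: int, depth: int) -> dict:
    """Recursively partition points by their distance from a reference point."""
    if depth == 0 or len(points) <= 1:
        return {"leaf": list(points)}
    ref = points[0]  # assumption: the first object serves as the reference point
    bins: List[List[Vector]] = [[] for _ in range(n_bins)]
    for p in points:
        bins[bin_of(soergel(ref, p), n_bins)].append(p)
    return {"ref": ref, "children": [build(b, n_bins, depth - 1) for b in bins]}


def knn(node: dict, query: Vector, k: int, n_bins: int) -> List[Vector]:
    """Descend to the single bin matching the query, then scan only that bin."""
    while "leaf" not in node:
        node = node["children"][bin_of(soergel(node["ref"], query), n_bins)]
    return sorted(node["leaf"], key=lambda p: soergel(query, p))[:k]


# Toy data, invented for illustration only.
data = [(1, 0, 0), (2, 1, 0), (1, 1, 1), (0, 0, 5), (0, 1, 4), (5, 5, 5)]
index = build(data, n_bins=2, depth=2)
print(knn(index, (0, 0, 6), k=2, n_bins=2))  # [(0, 0, 5), (0, 1, 4)]
```

On this toy data the single-bin search scans four of the six objects and returns the same two neighbours as a brute-force scan. In this sketch that agreement is not guaranteed in general: a true neighbour that falls just across a bin boundary would be missed.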

Bibliographic record

Source: Advances in Computer Science and Information Technology: ACSIT, 2015, No. 11, 5 pages
Authors: Raghunadh Pasunuri; Sobha Rani T
Original format: PDF
Chinese Library Classification: Computing technology, computer technology
