Journal: 《电子学报》 (Acta Electronica Sinica)

LSH-Based k-Nearest-Neighbor Search Algorithm for High-Dimensional Big Data

Abstract

Locality-sensitive hashing (LSH) and its variants are efficient algorithms for k-nearest-neighbor (kNN) search on high-dimensional data. However, as data volumes keep growing, traditional centralized LSH algorithms can no longer meet the demands of the big data era. This paper analyzes the shortcomings of traditional LSH schemes, extends the AND-OR construction, and proposes C2SLSH, a kNN search algorithm for high-dimensional big data that obtains its results directly from the index without comparing against the original data. Theoretical analysis and experimental results show that C2SLSH scales stably on a distributed platform and, at the same accuracy, runs about three times faster than existing methods.
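For readers unfamiliar with the AND-OR construction that the abstract builds on, the following is a minimal sketch of the classical, centralized scheme, not of C2SLSH itself, whose details the abstract does not give. It assumes p-stable (Gaussian) hash functions for Euclidean distance. The class name `AndOrLSH` and the parameters `k_funcs`, `n_tables`, and `width` are hypothetical names chosen for illustration. The final re-ranking against the stored vectors is the step that, according to the abstract, C2SLSH avoids.

```python
import math
import random
from collections import defaultdict


class AndOrLSH:
    """Classical AND-OR LSH index (illustrative sketch, not C2SLSH)."""

    def __init__(self, dim, k_funcs=4, n_tables=8, width=4.0, seed=0):
        rng = random.Random(seed)
        self.width = width
        # One group of k_funcs hash functions per table.
        # Each function is h(x) = floor((a . x + b) / width), a ~ N(0, 1).
        self.funcs = [
            [
                ([rng.gauss(0.0, 1.0) for _ in range(dim)],
                 rng.uniform(0.0, width))
                for _ in range(k_funcs)
            ]
            for _ in range(n_tables)
        ]
        self.tables = [defaultdict(list) for _ in range(n_tables)]
        self.data = []

    def _key(self, table_idx, vec):
        # AND: concatenate k_funcs hash values; two points share a bucket
        # only if every one of the values agrees.
        return tuple(
            math.floor((sum(a_i * x_i for a_i, x_i in zip(a, vec)) + b)
                       / self.width)
            for a, b in self.funcs[table_idx]
        )

    def add(self, vec):
        idx = len(self.data)
        self.data.append(list(vec))
        for t, table in enumerate(self.tables):
            table[self._key(t, vec)].append(idx)
        return idx

    def candidates(self, query):
        # OR: a point is a candidate if it collides with the query
        # in at least one of the tables.
        found = set()
        for t, table in enumerate(self.tables):
            found.update(table.get(self._key(t, query), ()))
        return found

    def knn(self, query, k):
        # Classical final step: re-rank candidates by exact Euclidean
        # distance to the original vectors.
        cands = self.candidates(query)
        return sorted(cands, key=lambda i: math.dist(query, self.data[i]))[:k]
```

In this sketch, raising `k_funcs` makes each bucket more selective (fewer false positives), while raising `n_tables` gives near neighbors more chances to collide with the query (higher recall). Because the result is approximate, `knn` may return fewer than `k` points when few candidates collide.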
