Grassmann Hashing for approximate nearest neighbor search in high dimensional space


Abstract

Locality-Sensitive Hashing (LSH) approximates nearest neighbors in high dimensions by projecting the original data into low-dimensional subspaces. The basic idea is to hash data samples so that the probability of collision is much higher for samples that are close to each other than for those that are far apart. However, because it applies k random hashing functions to the original data, LSH fails to find the most discriminant hashing subspaces, so the nearest neighbor approximation is inefficient. To alleviate this problem, we propose Grassmann Hashing (GRASH) for approximate nearest neighbor search in high dimensions. GRASH first introduces a set of subspace candidates from Linear Discriminant Analysis (LDA). Then it applies the Grassmann metric to select the optimal subspaces for hashing. Finally, it generates hashing codes based on a non-uniform bucket-size design motivated by Lloyd-Max quantization. The proposed GRASH model enjoys a number of merits: 1) GRASH introduces the Grassmann metric to measure the similarity between different hashing subspaces, so the hashing function can better capture the data diversity; 2) GRASH obtains the subspace candidates from LDA, so it incorporates the discriminant information into the hashing functions; 3) GRASH extends LSH's 1-d hashing subspaces to m-d, i.e., it is a multidimensional extension of hashing approximation; 4) motivated by Lloyd-Max quantization, GRASH applies non-uniform bucket sizes to generate hashing codes, so the distortion can be minimized. Experimental results on a number of datasets confirm the validity of our proposed model.
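
The non-uniform bucket idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no algorithmic details, so the function names (`lloyd_max`, `distortion`, `hash_code`), the parameters, and the synthetic Gaussian data are all hypothetical. In particular, random projection directions stand in for the LDA-derived subspaces that GRASH selects with the Grassmann metric, and only the 1-d case is shown. The sketch fits a Lloyd-Max quantizer to the projected values along each direction and uses the resulting bucket boundaries to form a hash code.

```python
import bisect
import random


def lloyd_max(samples, levels, iters=50):
    """Fit a 1-D Lloyd-Max quantizer; return (boundaries, centroids)."""
    xs = sorted(samples)
    # Assumption: initialise centroids at evenly spaced sample quantiles.
    centroids = [xs[(2 * i + 1) * len(xs) // (2 * levels)] for i in range(levels)]
    for _ in range(iters):
        # Nearest-centroid rule: boundaries are midpoints of adjacent centroids.
        bounds = [(centroids[i] + centroids[i + 1]) / 2 for i in range(levels - 1)]
        cells = [[] for _ in range(levels)]
        for x in xs:
            cells[bisect.bisect_right(bounds, x)].append(x)
        # Centroid rule: move each centroid to the mean of its cell
        # (an empty cell keeps its previous centroid).
        new = [sum(c) / len(c) if c else centroids[i] for i, c in enumerate(cells)]
        if new == centroids:
            break
        centroids = new
    bounds = [(centroids[i] + centroids[i + 1]) / 2 for i in range(levels - 1)]
    return bounds, centroids


def distortion(samples, bounds, centroids):
    """Mean squared error when each sample is replaced by its cell centroid."""
    total = sum((x - centroids[bisect.bisect_right(bounds, x)]) ** 2 for x in samples)
    return total / len(samples)


def hash_code(x, directions, bounds_per_direction):
    """Concatenate the non-uniform bucket index along each projection direction."""
    code = []
    for a, bounds in zip(directions, bounds_per_direction):
        value = sum(xi * ai for xi, ai in zip(x, a))
        code.append(bisect.bisect_right(bounds, value))
    return tuple(code)


# Hypothetical setup: synthetic data and random directions (not LDA subspaces).
rng = random.Random(0)
dim, k, levels = 16, 3, 8
data = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(500)]
directions = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(k)]

bounds_per_direction = []
for a in directions:
    projected = [sum(xi * ai for xi, ai in zip(x, a)) for x in data]
    bounds, _ = lloyd_max(projected, levels)
    bounds_per_direction.append(bounds)

codes = [hash_code(x, directions, bounds_per_direction) for x in data]
```

Samples that share a code fall into the same bucket along every direction and would be treated as nearest-neighbor candidates. Comparing `distortion` with `iters=0` against the default number of iterations shows how the Lloyd-Max updates reduce the quantization error on the fitted samples relative to the initial bucket layout.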

Bibliographic details

Source: 2011 IEEE International Conference on Multimedia and Expo, 2011, pp. 1-6 (6 pages)
Authors: Xinchao Wang; Zhu Li; Lei Zhang; Yuan Junsong
Original format: PDF
CLC classification: Multimedia technology and multimedia computers
Keywords: Grassmann manifold; hashing; optimization; subspace learning

Similar literature

1. Approximate Nearest Neighbor Search towards Removing the Curse of Dimensionality Query-Aware and Locality-Sensitive Hashing [J]. S Ratna Kumari, D Durga Prasad. International Journal of Computer Science and Technology. 2017, no. 4a1.
2. Efficient search for approximate nearest neighbor in high dimensional spaces [J]. Kushilevitz E., Rabani Y., Ostrovsky R. SIAM Journal on Computing. 2000, no. 2.
3. Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions [J]. Amos Olagunju. Computing Reviews. 2010, no. 4.
4. Grassmann Hashing for approximate nearest neighbor search in high dimensional space [C]. Xinchao Wang, Zhu Li, Lei Zhang. IEEE International Conference on Multimedia and Expo. 2011.
5. Fast Locality Sensitive Hashing Algorithm for Approximate Nearest Neighbor Search: A Practical Data Mining Approach [D]. Buaba, Ruben. 2012.
6. Approximate Nearest Neighbor Search by Residual Vector Quantization [O]. Yongjian Chen, Tao Guan, Cheng Wang. 2010.
7. PAC Nearest Neighbor Queries: Approximate and Controlled Search in High-Dimensional and Metric Spaces [O]. Paolo Ciaccia Deis, Paolo Ciaccia. 2000.
