Home > Foreign Journals > IEEE Transactions on Image Processing > Robust Quantization for General Similarity Search
Robust Quantization for General Similarity Search



Abstract

Recent years have witnessed the emergence of vector quantization (VQ) techniques for efficient similarity search. VQ partitions the feature space into a set of codewords and encodes data points as integer indices over those codewords. The distance between data points can then be efficiently approximated by simple memory-lookup operations. This compact quantization significantly reduces storage cost and search complexity, thereby enabling efficient large-scale similarity search. However, the performance of several celebrated VQ approaches degrades significantly when dealing with noisy data. In addition, they can serve only a narrow range of applications, because the distortion measure is limited to the ℓ2 norm. To address these shortcomings of the squared Euclidean (ℓ2,2-norm) loss function employed by VQ approaches, in this paper we propose a novel robust and general VQ framework, named RGVQ, to enhance both the robustness and the generality of VQ approaches. Specifically, an ℓp,q-norm loss function is proposed to conduct ℓp-norm similarity search rather than ℓ2-norm search, and the q-th-order loss is used to enhance robustness. Although changing the loss function to the ℓp,q norm makes VQ approaches more robust and more general, it poses a challenge: a non-smooth, non-convex, orthogonality-constrained ℓp,q-norm function must be minimized. To solve this problem, we propose a novel and efficient optimization scheme, specialize it to VQ approaches, and theoretically prove its convergence. Extensive experiments on benchmark data sets demonstrate that the proposed RGVQ outperforms the original VQ for several approaches, especially when searching for similar items in noisy data.
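The basic pipeline the abstract refers to (train a codebook, store each point as an integer code, approximate query distances with a precomputed lookup table) can be sketched in a few lines. This is an illustrative toy, not the paper's RGVQ algorithm: it uses a plain k-means codebook with the ordinary squared-ℓ2 distortion, and the `lpq_loss` helper at the end uses one common convention for the ℓp,q norm (sum over rows of the q-th power of each row's ℓp norm); all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))  # toy data: 1000 points in 8 dimensions

# --- Train a toy VQ codebook with Lloyd's k-means iterations. ---
K = 16  # number of codewords
codebook = X[rng.choice(len(X), K, replace=False)].copy()
for _ in range(20):
    # Assign each point to its nearest codeword under squared-l2 distortion.
    dist = ((X[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    assign = dist.argmin(1)
    for k in range(K):
        members = X[assign == k]
        if len(members):
            codebook[k] = members.mean(0)

# --- Encode: each point is stored as a single integer codeword index. ---
codes = ((X[:, None, :] - codebook[None, :, :]) ** 2).sum(-1).argmin(1)

# --- Query: precompute a K-entry lookup table of query-to-codeword
# distances, then approximate query-to-point distance by one lookup each. ---
query = rng.normal(size=8)
lut = ((query[None, :] - codebook) ** 2).sum(-1)  # K entries
approx = lut[codes]                               # approx. distance to every point
exact = ((query - X) ** 2).sum(-1)                # exact distances, for comparison

def lpq_loss(R, p, q):
    """l_{p,q} loss of a residual matrix R: sum_i (sum_j |r_ij|^p)^(q/p).

    With p = q = 2 this reduces to the squared-Euclidean (l_{2,2}) loss;
    smaller q (e.g. q = 1) down-weights large residuals, which is the
    robustness mechanism the abstract alludes to.
    """
    return ((np.abs(R) ** p).sum(axis=1) ** (q / p)).sum()

residuals = X - codebook[codes]
l22 = lpq_loss(residuals, p=2, q=2)   # ordinary VQ distortion
l21 = lpq_loss(residuals, p=2, q=1)   # a more outlier-robust variant
```

Replacing the ℓ2,2 objective with a general ℓp,q one, as the abstract notes, breaks the closed-form centroid update used above, which is why the paper needs a dedicated optimization scheme.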

