Journal: ACM Transactions on Multimedia Computing, Communications, and Applications

Approximate Asymmetric Search for Binary Embedding Codes

Abstract

In this article, we propose an approximate asymmetric nearest-neighbor search method for binary embedding codes. The asymmetric distance exploits the lower information loss on the query side. However, computing asymmetric distances by exhaustive search is prohibitively expensive for large-scale datasets. We present a novel method, called multi-index voting, that integrates the multi-index hashing technique with a voting mechanism to select suitable candidates and compute only their asymmetric distances. We show that the candidate selection scheme can be formulated as the tail of the binomial distribution function. In addition, we propose a binary feature selection method based on minimal quantization error to address memory limitations and improve search accuracy. Extensive experiments demonstrate that the proposed method achieves accuracy close to that of exhaustive search while significantly reducing the runtime. For example, on a dataset of one billion 256-bit binary codes, examining only 0.5% of the dataset reaches 95-99% of the accuracy of exhaustive search and accelerates the search by a factor of 73-128. The method also offers an excellent tradeoff between search accuracy and time efficiency compared with state-of-the-art nearest-neighbor search methods. Moreover, the proposed feature selection method improves accuracy by up to 8.35% compared with other feature selection methods.
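
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' algorithm. It assumes that each code is split into `m` equal substrings, that a database item receives one vote per hash table in which its substring exactly matches the query's (the paper's multi-index hashing may probe more broadly), and that the asymmetric distance is the squared Euclidean distance between the real-valued query and the code mapped to {-1, +1}. All function names are hypothetical. `binomial_tail` gives the probability of at least `k` votes if each of the `m` substrings matched independently with probability `p`, which is one plausible reading of the binomial-tail formulation mentioned above.

```python
from collections import defaultdict
from math import comb


def bits_to_substrings(bits, m):
    """Split a sequence of 0/1 bits into m equal-length tuples."""
    n = len(bits)
    if n % m != 0:
        raise ValueError("code length must be divisible by m")
    step = n // m
    return [tuple(bits[i:i + step]) for i in range(0, n, step)]


def build_tables(codes, m):
    """Build one hash table per substring position (assumed exact-match buckets)."""
    tables = [defaultdict(list) for _ in range(m)]
    for idx, code in enumerate(codes):
        for table, key in zip(tables, bits_to_substrings(code, m)):
            table[key].append(idx)
    return tables


def vote_candidates(query_bits, tables, min_votes):
    """Return ids whose substrings match the query in at least min_votes tables."""
    votes = defaultdict(int)
    for table, key in zip(tables, bits_to_substrings(query_bits, len(tables))):
        for idx in table.get(key, ()):
            votes[idx] += 1
    return sorted(i for i, v in votes.items() if v >= min_votes)


def asymmetric_distance(query, code):
    """Assumed form: squared distance from the real-valued query to the code in {-1, +1}."""
    return sum((q - (2 * b - 1)) ** 2 for q, b in zip(query, code))


def binomial_tail(m, p, k):
    """P(X >= k) for X ~ Binomial(m, p)."""
    return sum(comb(m, i) * p**i * (1 - p) ** (m - i) for i in range(k, m + 1))


def search(query, codes, tables, min_votes, top_k):
    """Vote for candidates with the binarized query, then rank them asymmetrically."""
    query_bits = tuple(1 if q > 0 else 0 for q in query)  # assumed sign binarization
    candidates = vote_candidates(query_bits, tables, min_votes)
    scored = sorted((asymmetric_distance(query, codes[i]), i) for i in candidates)
    return [(i, d) for d, i in scored[:top_k]]
```

Raising `min_votes` shrinks the candidate set, so fewer asymmetric distances are computed at the cost of possibly missing true neighbors. This is the accuracy-versus-runtime tradeoff the abstract describes.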
