
Toward more efficient locality-sensitive hashing via constructing novel hash function cluster


Abstract

Locality-sensitive hashing (LSH) is widely used for nearest neighbor search over large-scale, high-dimensional data. However, LSH methods suffer from a serious imbalance between the efficiency of index structure construction and query accuracy. In this article, we propose a novel higher-entropy-hyperplane clusters LSH (HEHC-LSH) algorithm. We improve vector quantization to preprocess the data, which greatly shortens the preprocessing time. We integrate the maximum entropy principle into the distribution estimation algorithm to construct a novel hash function cluster method, incorporate bootstrap aggregating from ensemble learning, and adopt a parallel index dictionary to improve the generalization performance of the index structure. In the query stage, we filter the index set comprehensively using an ensemble learning approach, which not only avoids many distance calculations but also improves the quality of the query results. We also analyze the rationality and effectiveness of the proposed method. Finally, extensive experimental results show that HEHC-LSH achieves higher precision and higher efficiency simultaneously compared with current methods, and that it is robust across different datasets.
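For readers unfamiliar with the baseline that the abstract builds on, the following is a minimal sketch of plain random-hyperplane LSH with several hash tables. It is not the HEHC-LSH algorithm: it has no vector quantization step, no maximum-entropy hyperplane selection, and no bagging. All function names and parameters (`build_index`, `query`, `num_tables`, `num_bits`) are hypothetical and chosen only for illustration.

```python
import random

def make_hyperplanes(num_bits, dim, rng):
    """Draw num_bits random Gaussian hyperplane normals in dim dimensions."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]

def hash_key(planes, vec):
    """Return the side of each hyperplane that vec falls on, as a tuple of bits."""
    return tuple(
        1 if sum(p * x for p, x in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )

def build_index(data, num_tables=8, num_bits=6, seed=0):
    """Build num_tables independent hash tables (assumed parameters)."""
    rng = random.Random(seed)
    dim = len(data[0])
    tables = []
    for _ in range(num_tables):
        planes = make_hyperplanes(num_bits, dim, rng)
        buckets = {}
        for i, vec in enumerate(data):
            buckets.setdefault(hash_key(planes, vec), []).append(i)
        tables.append((planes, buckets))
    return tables

def query(tables, data, q, k=1):
    """Collect bucket collisions from every table, then rank them exactly."""
    candidates = set()
    for planes, buckets in tables:
        candidates.update(buckets.get(hash_key(planes, q), []))

    def sq_dist(i):
        return sum((a - b) ** 2 for a, b in zip(data[i], q))

    # Exact distances are computed only for the candidates, not the full set.
    return sorted(candidates, key=sq_dist)[:k]
```

The union over tables in `query` is the simplest possible candidate filter. The trade-off the abstract refers to shows up here directly: more tables and more bits cost more index construction time, while fewer of either lowers the chance that true neighbors share a bucket with the query.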

Bibliographic details

  • Source
    Concurrency and Computation: Practice and Experience | 2021, Issue 20 | e6355.1-e6355.21 | 21 pages
  • Author affiliations

    Fujian Normal Univ Coll Math & Informat Fuzhou Fujian Peoples R China|Fujian Normal Univ Digit Fujian Internet Of Things Lab Environm Moni Fuzhou Fujian Peoples R China;

    Fujian Normal Univ Coll Math & Informat Fuzhou Fujian Peoples R China;

    Fujian Normal Univ Coll Math & Informat Fuzhou Fujian Peoples R China|Fujian Normal Univ Digit Fujian Internet Of Things Lab Environm Moni Fuzhou Fujian Peoples R China|Fujian Normal Univ Ctr Appl Math Fujian Prov Fuzhou Fujian Peoples R China;

    Fujian Normal Univ Coll Math & Informat Fuzhou Fujian Peoples R China;

    Fujian Normal Univ Coll Math & Informat Fuzhou Fujian Peoples R China|Fujian Normal Univ Digit Fujian Internet Of Things Lab Environm Moni Fuzhou Fujian Peoples R China;

    Fujian Normal Univ Coll Math & Informat Fuzhou Fujian Peoples R China|Fujian Normal Univ Digit Fujian Internet Of Things Lab Environm Moni Fuzhou Fujian Peoples R China;

  • Indexing information
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification
  • Keywords

    hash function cluster; locality-sensitive hashing; maximum entropy principle; nearest neighbor search; parallel index dictionary; vector quantization

