Journal: Information Processing & Management

Use of permutation prefixes for efficient and scalable approximate similarity search



Abstract

We present the Permutation Prefix Index (PP-Index), an index data structure that supports efficient approximate similarity search. (This work is a revised and extended version of Esuli (2009b), presented at the 2009 LSDS-IR Workshop, held in Boston.) The PP-Index belongs to the family of permutation-based indexes, which represent each indexed object by "its view of the surrounding world", i.e., a list of the elements of a set of reference objects, sorted by their distance from the indexed object. In its basic formulation, the PP-Index is strongly biased toward efficiency. We show how effectiveness can easily reach optimal levels just by adopting two "boosting" strategies, multiple index search and multiple query search, both of which have nice parallelization properties. We study both the efficiency and the effectiveness of the PP-Index, experimenting with collections of up to one hundred million objects, represented in a very high-dimensional similarity space.
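To make the idea of a permutation prefix concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the implementation described in the paper: the function names, the toy 2-D data, the Euclidean distance, and the plain dictionary used as the index are all hypothetical choices made for this example. Each object is represented by the identifiers of its closest reference objects, nearest first; objects that share a prefix with the query become candidates, which are then ranked by their true distance. The two "boosting" strategies from the abstract are not shown.

```python
import math
from collections import defaultdict


def permutation_prefix(obj, references, length, dist):
    """Ids of the `length` reference objects closest to `obj`, nearest first."""
    # Ties are broken by reference id so the result is deterministic.
    order = sorted(range(len(references)),
                   key=lambda i: (dist(obj, references[i]), i))
    return tuple(order[:length])


def build_index(objects, references, length, dist):
    """Group object ids by permutation prefix (a dict stands in for the index)."""
    index = defaultdict(list)
    for obj_id, obj in enumerate(objects):
        index[permutation_prefix(obj, references, length, dist)].append(obj_id)
    return index


def candidates(index, query, references, length, dist, min_candidates):
    """Objects sharing the longest possible prefix with the query.

    The prefix is shortened one element at a time until at least
    `min_candidates` objects match; if none does, every object is returned.
    """
    full = permutation_prefix(query, references, length, dist)
    for k in range(length, 0, -1):
        found = [o for prefix, ids in index.items()
                 if prefix[:k] == full[:k] for o in ids]
        if len(found) >= min_candidates:
            return found
    return [o for ids in index.values() for o in ids]


def search(index, objects, query, references, length, dist, k, min_candidates):
    """Approximate k-NN: rank only the candidates by their true distance."""
    ids = candidates(index, query, references, length, dist, min_candidates)
    return sorted(ids, key=lambda i: dist(query, objects[i]))[:k]


# Toy data (assumed for illustration): four reference points, five objects.
references = [(0, 0), (10, 0), (0, 10), (10, 10)]
objects = [(1, 2), (2, 3), (9, 2), (8, 9), (3, 1)]
index = build_index(objects, references, 2, math.dist)
nearest = search(index, objects, (1, 2.5), references, 2, math.dist,
                 k=1, min_candidates=2)
```

In this sketch a shorter prefix yields more candidates and therefore more distance computations, which is one simple way to trade efficiency for effectiveness.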
