Journal: ACM Transactions on Information Systems

Region Proximity in Metric Spaces and Its Use for Approximate Similarity Search

Abstract

Similarity search structures for metric data typically bound object partitions with ball regions. Because regions can overlap, it is useful to estimate the proximity of two regions in order to predict the number of objects in their intersection. This paper analyzes the problem with a probabilistic approach and provides a solution that computes proximity effectively through realistic heuristics requiring only small amounts of auxiliary data. An extensive simulation validates the technique. An application demonstrates how the proximity measure can be applied to approximate similarity search: search is sped up by ignoring data regions whose proximity to the query region is smaller than a user-defined threshold. This idea is implemented in a metric tree environment for similarity range and nearest-neighbor queries. Several measures of efficiency and effectiveness are applied to evaluate the proposed approximate search algorithms on real-life data sets. An analytical model is developed to relate proximity parameters to the quality of search. Improvements of two orders of magnitude are achieved for moderately approximated search results. We demonstrate that the precision of proximity measures can significantly influence the quality of approximate algorithms.
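
The threshold-based pruning described in the abstract can be illustrated with a small sketch. It is not the paper's method. Proximity is estimated here by naive counting over a sample of the data, whereas the paper derives it from heuristics that need only small amounts of auxiliary data. The names `estimate_proximity` and `approximate_range_search`, the Euclidean metric, and the flat list of ball regions (standing in for a metric tree) are all assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Stand-in metric (assumption); any distance obeying the metric axioms works."""
    return math.dist(a, b)


def estimate_proximity(ball_a, ball_b, sample, dist=euclidean):
    """Estimate the proximity of two ball regions.

    Each ball is a (center, radius) pair. The estimate is the fraction of
    sample objects lying inside both balls. This is a naive counting
    estimator, not the paper's heuristic.
    """
    (center_a, radius_a), (center_b, radius_b) = ball_a, ball_b
    if not sample:
        return 0.0
    hits = sum(
        1
        for obj in sample
        if dist(obj, center_a) <= radius_a and dist(obj, center_b) <= radius_b
    )
    return hits / len(sample)


def approximate_range_search(query, radius, regions, sample, threshold,
                             dist=euclidean):
    """Approximate range query over a flat list of ball regions.

    regions: iterable of (center, covering_radius, objects) triples
             (hypothetical layout standing in for metric tree nodes).
    A region is scanned only if it intersects the query ball and its
    estimated proximity to the query ball is at least `threshold`.
    Returns (matching objects, number of regions scanned).
    """
    results = []
    scanned = 0
    for center, covering_radius, objects in regions:
        # Exact pruning: disjoint balls cannot share any object.
        if dist(query, center) > radius + covering_radius:
            continue
        # Approximate pruning: ignore regions with low proximity to the query.
        proximity = estimate_proximity(
            (query, radius), (center, covering_radius), sample, dist
        )
        if proximity < threshold:
            continue
        scanned += 1
        results.extend(obj for obj in objects if dist(query, obj) <= radius)
    return results, scanned
```

With `threshold=0.0` only the exact intersection test prunes, so the search is exact. Raising the threshold skips more overlapping regions and trades result quality for fewer region scans, which is the trade-off the abstract evaluates.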