IEEE Transactions on Knowledge and Data Engineering

Efficient Similarity Search in Nonmetric Spaces with Local Constant Embedding


Abstract

Similarity-based search has been a key factor for many applications, such as multimedia retrieval, data mining, and web search and retrieval. There are two important issues related to similarity search: the design of a distance function to measure similarity, and the improvement of search efficiency. Many distance functions have been proposed that attempt to closely mimic human recognition. Unfortunately, some of these well-designed distance functions do not satisfy the triangle inequality and are, therefore, non-metric. As a consequence, efficient retrieval using these non-metric distance functions becomes more challenging, since most existing index structures assume that the indexed distance functions are metric. In this paper, we address this challenging problem by proposing an efficient method, local constant embedding (LCE), which divides the data set into disjoint groups so that the triangle inequality holds within each group after constant shifting. Furthermore, we design a pivot selection approach for the converted metric distance and create an index structure to speed up retrieval. Extensive experiments show that our method works well on various non-metric distance functions and improves the retrieval efficiency by an order of magnitude compared to the linear scan and existing retrieval approaches with no false dismissals.
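The constant-shifting idea mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not say how LCE chooses its groups, so the partition `groups` below is supplied by hand, and the helper names (`min_shift_constant`, `shifted`, `is_metric_on`) and the squared-difference distance `sq_dist` are assumptions made for illustration. The sketch only shows that adding a constant c to every distance between distinct points restores the triangle inequality once c is at least the largest violation d(x, z) - d(x, y) - d(y, z), and that the constant needed inside a small group can be much smaller than the one needed for the whole data set.

```python
from itertools import permutations


def min_shift_constant(points, dist):
    """Smallest c >= 0 such that dist(x, y) + c obeys the triangle inequality on `points`."""
    c = 0.0
    for x, y, z in permutations(points, 3):
        # Amount by which d(x, z) <= d(x, y) + d(y, z) is violated (positive = broken).
        c = max(c, dist(x, z) - dist(x, y) - dist(y, z))
    return c


def shifted(dist, c):
    """Shifted distance: dist(x, y) + c for distinct points, 0 for identical ones."""
    return lambda x, y: 0.0 if x == y else dist(x, y) + c


def is_metric_on(points, dist, eps=1e-12):
    """Check the triangle inequality over all ordered triples of distinct points."""
    return all(dist(x, z) <= dist(x, y) + dist(y, z) + eps
               for x, y, z in permutations(points, 3))


def sq_dist(a, b):
    """Squared difference: symmetric and non-negative, but not a metric (illustrative choice)."""
    return float((a - b) ** 2)


data = [0, 1, 2, 10, 11, 12]
groups = [[0, 1, 2], [10, 11, 12]]  # hypothetical partition, chosen by hand

global_c = min_shift_constant(data, sq_dist)
local_cs = [min_shift_constant(g, sq_dist) for g in groups]

print(global_c)   # 40.0
print(local_cs)   # [2.0, 2.0]
print(all(is_metric_on(g, shifted(sq_dist, c)) for g, c in zip(groups, local_cs)))  # True
```

In this toy example a single global shift of 40 is needed, while a shift of 2 suffices inside each group. A smaller shift distorts the original distances less, which generally leaves more room for pruning based on the triangle inequality.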
