Journal: Algorithmica

Lower Bounds on Performance of Metric Tree Indexing Schemes for Exact Similarity Search in High Dimensions

Abstract

Within a mathematically rigorous model, we analyse the curse of dimensionality for deterministic exact similarity search in the context of popular indexing schemes: metric trees. The datasets X are sampled randomly from a domain Ω, equipped with a distance, ρ, and an underlying probability distribution, μ. While performing an asymptotic analysis, we send the intrinsic dimension d of Ω to infinity, and assume that the size of a dataset, n, grows superpolynomially yet subexponentially in d. Exact similarity search refers to finding the nearest neighbour in the dataset X to a query point ω ∈ Ω, where the query points are subject to the same probability distribution μ as datapoints. Let F denote a class of all 1-Lipschitz functions on Ω that can be used as decision functions in constructing a hierarchical metric tree indexing scheme. Suppose the VC dimension of the class of all sets {ω: f(ω) ≥ a}, a ∈ R, is o(n^{1/4} log^2 n). (In view of a 1995 result of Goldberg and Jerrum, even a stronger complexity assumption d^{O(1)} is reasonable.) We deduce the Ω(n^{1/4}) lower bound on the expected average case performance of hierarchical metric-tree based indexing schemes for exact similarity search in (Ω, X). In particular, this bound is superpolynomial in d.
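
To make the objects in the abstract concrete, the following is a minimal sketch of one familiar kind of metric tree, a vantage-point tree, with exact nearest-neighbour search. It is an illustration only and not the construction analysed in the paper. The names `VPNode`, `build` and `search`, the Euclidean distance, and the uniform random data are all assumptions made for this example. Each node's decision function f(ω) = ρ(ω, vantage) is 1-Lipschitz by the triangle inequality, and that property is what justifies the pruning step. The search also counts distance evaluations, which is the kind of cost that a lower bound on search performance concerns.

```python
import math
import random


class VPNode:
    """One node of a vantage-point tree (hypothetical name for this sketch)."""

    def __init__(self, vantage, radius, inner, outer):
        self.vantage = vantage  # pivot point stored at this node
        self.radius = radius    # threshold a for the decision function
        self.inner = inner      # subtree of points with f(p) < radius
        self.outer = outer      # subtree of points with f(p) >= radius


def build(points):
    """Recursively build a tree, splitting at the median distance to the pivot."""
    if not points:
        return None
    vantage, rest = points[0], points[1:]
    if not rest:
        return VPNode(vantage, 0.0, None, None)
    dists = [math.dist(vantage, p) for p in rest]
    radius = sorted(dists)[len(dists) // 2]
    inner = [p for p, d in zip(rest, dists) if d < radius]
    outer = [p for p, d in zip(rest, dists) if d >= radius]
    return VPNode(vantage, radius, build(inner), build(outer))


def _nearest(node, query, best, counter):
    if node is None:
        return best
    d = math.dist(query, node.vantage)  # value of the decision function at the query
    counter[0] += 1
    if d < best[0]:
        best = (d, node.vantage)
    # Descend first into the side of the split that contains the query.
    if d < node.radius:
        first, second = node.inner, node.outer
    else:
        first, second = node.outer, node.inner
    best = _nearest(first, query, best, counter)
    # 1-Lipschitz pruning: every point p on the other side satisfies
    # dist(query, p) >= |d - radius|, so that side can be skipped
    # whenever |d - radius| exceeds the best distance found so far.
    if abs(d - node.radius) <= best[0]:
        best = _nearest(second, query, best, counter)
    return best


def search(root, query):
    """Return (distance, nearest point, number of distance evaluations)."""
    counter = [0]
    dist, point = _nearest(root, query, (math.inf, None), counter)
    return dist, point, counter[0]


if __name__ == "__main__":
    rng = random.Random(0)
    for dim in (2, 8, 32):
        data = [tuple(rng.random() for _ in range(dim)) for _ in range(500)]
        root = build(data)
        query = tuple(rng.random() for _ in range(dim))
        _, _, evaluations = search(root, query)
        print(f"dim={dim}: {evaluations} of {len(data)} distances evaluated")
```

Running the script prints, for a few dimensions, how many of the stored points the search had to compare against the query. No particular output is claimed here; the sketch only shows where the pruning rule enters and how the cost can be measured.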