...
首页> 外文期刊>The Computer journal >On approximate algorithms for distance-based queries using R-trees
【24h】

On approximate algorithms for distance-based queries using R-trees

机译:关于使用R树的基于距离的查询的近似算法

获取原文
获取原文并翻译 | 示例
           

摘要

In modern database applications the similarity or dissimilarity of complex objects is examined by performing distance-based queries (DBQs) on data of high dimensionality. The R-tree and its variations are commonly cited multidimensional access methods that can beused for answering such queries. Although the related algorithms work well for low-dimensional data spaces, their performance degrades as the number of dimensions increases (dimensionality curse). In order to obtain acceptable response time in high-dimensional data spaces, algorithms that obtain approximate solutions can be used. Approximation techniques, like N-consider (based on the tree structure), alpha-allowance and epsilon-approximate (based on distance), or Time-consider (based on time) can be applied in branch-and-bound algorithms for DBQs inorder to control the trade-off between cost and accuracy of the result. In this paper, we improve previous approximate DBQ algorithms by applying a combination of the approximation techniques in the same query algorithm (hybrid approximation scheme). We investigate the performance of these improvements for one of the most representative DBQs (the K-closest pairs query, K-CPQ) in high-dimensional data spaces, as well as the influence of the algorithmic parameters on the control of the trade-off between the response time and the accuracy of the result. The outcome of the experimental evaluation, using synthetic and real datasets, is the derivation of the outperforming DBQ approximate algorithm for large high-dimensional point datasets.
机译:在现代数据库应用程序中,通过对高维数据执行基于距离的查询(DBQ)来检查复杂对象的相似性或不相似性。 R树及其变体是通常引用的多维访问方法,可用于回答此类查询。尽管相关算法在低维数据空间上运行良好,但其性能会随着维数的增加而降低(维数诅咒)。为了在高维数据空间中获得可接受的响应时间,可以使用获得近似解的算法。可以在分支定界算法中为DBQ顺序应用近似技术,例如N考虑(基于树结构),alpha允许和epsilon近似(基于距离)或Time-consider(基于时间)。在成本和结果准确性之间进行权衡。在本文中,我们通过在同一查询算法(混合近似方案)中应用近似技术的组合来改进先前的近似DBQ算法。我们调查了这些改进对于高维度数据空间中最有代表性的DBQ之一(K最近对查询,K-CPQ)的性能,以及算法参数对权衡控制的影响在响应时间和结果准确性之间。使用合成和真实数据集进行实验评估的结果是对大型高维点数据集使用了性能优于DBQ近似算法的结果。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号