20th British National Conference on Databases (BNCOD 20), Jul 15-17, 2003, Coventry, UK

Quantization Techniques for Similarity Search in High-Dimensional Data Spaces



Abstract

In recent years, several techniques have been developed for efficient similarity search in high-dimensional data spaces. Those based on vector approximation via quantization have been shown to be the most effective. The VA-file was the first technique to use vector approximation. The IQ-tree and the A-tree are subsequent techniques that impose a directory structure over the quantized VA-file representation. The performance gains of the IQ-tree result mainly from an optimized I/O strategy permitted by the directory structure, whereas those of the A-tree result mainly from exploiting the clustering of the data itself. In our work, we first evaluate the relative performance of these two enhanced approaches over high-dimensional data sets with different clustering characteristics. Second, we present the Clustered IQ-Tree, an indexing strategy that combines the best features of the IQ-tree and the A-tree, leading to better query performance than the former and more stable performance than the latter across different types of data sets.
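To make the idea of vector approximation via quantization concrete, the following is a minimal sketch of a VA-file-style nearest-neighbor search. It is an illustration under stated assumptions, not the implementation evaluated in the paper: it assumes a uniform grid with the same number of bits per dimension, Euclidean distance, and in-memory data. The function names (`build_grid`, `quantize`, `bounds`, `nearest`) are hypothetical. The directory structures of the IQ-tree and the A-tree, and the Clustered IQ-Tree itself, are not modeled here.

```python
import math

def build_grid(vectors, bits):
    """Assumed uniform grid: per dimension, return the lower edge and cell width."""
    cells = 1 << bits
    lows, widths = [], []
    for d in range(len(vectors[0])):
        column = [v[d] for v in vectors]
        lo, hi = min(column), max(column)
        lows.append(lo)
        # Fall back to width 1.0 when every value in the dimension is identical.
        widths.append((hi - lo) / cells or 1.0)
    return lows, widths

def quantize(v, lows, widths, bits):
    """Map a vector to its approximation: one small cell index per dimension."""
    top = (1 << bits) - 1
    return tuple(min(top, int((x - lo) / w)) for x, lo, w in zip(v, lows, widths))

def bounds(q, code, lows, widths):
    """Lower and upper bound on the distance from q to any point in the cell."""
    lb = ub = 0.0
    for x, c, lo, w in zip(q, code, lows, widths):
        cell_lo = lo + c * w
        cell_hi = cell_lo + w
        if x < cell_lo:
            near = cell_lo - x
        elif x > cell_hi:
            near = x - cell_hi
        else:
            near = 0.0
        far = max(abs(x - cell_lo), abs(x - cell_hi))
        lb += near * near
        ub += far * far
    return math.sqrt(lb), math.sqrt(ub)

def nearest(q, vectors, bits=4):
    """Return (index of the nearest vector, number of candidates refined)."""
    lows, widths = build_grid(vectors, bits)
    codes = [quantize(v, lows, widths, bits) for v in vectors]
    # Phase 1 (filter): scan only the compact approximations.
    all_bounds = [bounds(q, c, lows, widths) for c in codes]
    best_ub = min(ub for _, ub in all_bounds)
    candidates = [i for i, (lb, _) in enumerate(all_bounds) if lb <= best_ub]
    # Phase 2 (refine): compute exact distances for the surviving candidates.
    best = min(candidates, key=lambda i: math.dist(q, vectors[i]))
    return best, len(candidates)
```

A call such as `nearest(query, data, bits=4)` returns the index of the nearest vector together with the number of vectors whose exact coordinates had to be examined. The filter is safe because a vector's lower bound can never exceed its true distance, so the true nearest neighbor always survives phase 1; the benefit comes from how few other vectors survive with it.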
