Journal: Data & Knowledge Engineering

The Active Vertice method: a performant filtering approach to high-dimensional indexing



Abstract

The problem of finding nearest neighbors has emerged as an important foundation of feature-based similarity search in multimedia databases. Most spatial index structures based on the R-tree have failed to efficiently support nearest neighbor search in arbitrarily distributed high-dimensional data sets. In contrast, the so-called filtering principle as represented by the popular VA-file has turned out to be a more promising approach. Query processing is based on a flat file of compact vector approximations. In a first stage, those approximations are sequentially scanned and filtered so that in a second stage the nearest neighbors can be determined from a relatively small fraction of the data set. In this paper, we propose the Active Vertice method as a novel filtering approach. As opposed to the VA-file, approximation regions are arranged in a quad-tree like structure. High-dimensional feature vectors are assigned to ellipsoidal approximation regions on different levels of the tree. A compact approximation of a vector corresponds to the path within the index from the root to the respective tree node. When compared to the VA-file, our method enhances the discriminatory power of the approximations while maintaining their compactness in terms of storage consumption. To demonstrate its effectiveness, we conduct extensive experiments with synthetic as well as real-life data and show the superiority of our method over existing filtering approaches.
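
The two-stage filtering principle described in the abstract can be illustrated with a minimal sketch. This is not the Active Vertice method, and it is not the VA-file implementation either: it is a generic filter-and-refine search under the assumption that all coordinates lie in [0, 1] and that each dimension is quantized on a uniform grid with a fixed number of bits. The names `quantize`, `cell_bounds`, and `knn_filter_refine` are hypothetical and chosen for illustration only.

```python
import heapq
import math


def quantize(vec, bits):
    """Map each coordinate in [0, 1] to a grid cell index using `bits` bits.

    Assumption: a uniform grid per dimension (an illustrative choice).
    """
    cells = 1 << bits
    return tuple(min(int(x * cells), cells - 1) for x in vec)


def cell_bounds(query, approx, bits):
    """Return (lower, upper) bounds on the Euclidean distance between
    `query` and any point inside the grid cell described by `approx`."""
    width = 1.0 / (1 << bits)
    lo_sq = hi_sq = 0.0
    for q, c in zip(query, approx):
        lo, hi = c * width, (c + 1) * width
        # Closest and farthest points of the interval [lo, hi] relative to q.
        near = 0.0 if lo <= q <= hi else min(abs(q - lo), abs(q - hi))
        far = max(abs(q - lo), abs(q - hi))
        lo_sq += near * near
        hi_sq += far * far
    return math.sqrt(lo_sq), math.sqrt(hi_sq)


def knn_filter_refine(query, vectors, approximations, k, bits):
    """Return (k nearest as (distance, index) pairs, exact vectors accessed)."""
    # Stage 1: sequentially scan the compact approximations only.
    bounds = [cell_bounds(query, a, bits) for a in approximations]
    # At least k vectors lie within the k-th smallest upper bound, so any
    # vector whose lower bound exceeds it cannot be among the k nearest.
    kth_upper = heapq.nsmallest(k, (ub for _, ub in bounds))[-1]
    candidates = sorted(
        (lb, i) for i, (lb, _) in enumerate(bounds) if lb <= kth_upper
    )

    # Stage 2: visit the exact vectors in order of increasing lower bound.
    best = []  # max-heap emulated with negated distances: (-distance, index)
    accessed = 0
    for lb, i in candidates:
        # Stop once no remaining candidate can beat the current k-th best.
        if len(best) == k and lb > -best[0][0]:
            break
        accessed += 1
        d = math.dist(query, vectors[i])
        if len(best) < k:
            heapq.heappush(best, (-d, i))
        elif d < -best[0][0]:
            heapq.heapreplace(best, (-d, i))
    return sorted((-nd, i) for nd, i in best), accessed
```

In this sketch, the approximations would be built once with `[quantize(v, bits) for v in vectors]`. The second return value counts how many exact vectors the refinement stage had to read, which is the quantity a filtering approach tries to keep small. Tighter approximation regions give tighter bounds and therefore fewer reads.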
