Journal: Data & Knowledge Engineering

CM-tree: A dynamic clustered index for similarity search in metric databases

Abstract

Repositories of unstructured data, such as free text, images, audio and video, have recently emerged in various fields. A general approach to searching such data is similarity search, in which the query asks for objects similar to a given one and similarity is modeled by a metric distance function. In this article we propose a new dynamic, paged and balanced access method for similarity search in metric data sets, named the CM-tree (Clustered Metric tree). It fully supports dynamic insertions and deletions, both of single objects and in bulk. Unlike other methods, it is designed to achieve a structure of tight, low-overlap clusters through its primary construction algorithms (rather than through post-processing), yielding significantly improved performance. Several new methods are introduced to achieve this: a strategy for selecting the representative objects of nodes, a clustering-based node split algorithm together with criteria for triggering a node split, and an improved sub-tree pruning method used during search. To facilitate these methods, the pairwise distances between the objects of a node are maintained within each node. Results from an extensive experimental study show that the CM-tree outperforms the M-tree and the Slim-tree, improving search performance by up to 312% for I/O costs and 303% for CPU costs.
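
The abstract does not spell out the CM-tree's pruning rule, so the following is only a minimal sketch of the generic triangle-inequality pruning that metric indexes of this family rely on, not the paper's algorithm. The names `Cluster`, `range_search`, and `euclidean` are hypothetical, and a flat list of clusters stands in for a real paged tree. Each cluster stores the distance from its representative (pivot) to every member, which lets a range query skip whole clusters and individual members without computing their distance to the query.

```python
import math


def euclidean(a, b):
    """Example metric; any function satisfying the triangle inequality works."""
    return math.dist(a, b)


class Cluster:
    """Hypothetical stand-in for an index node: a pivot plus its members."""

    def __init__(self, pivot, members, dist):
        self.pivot = pivot
        self.members = list(members)
        # Distances to the pivot are computed once and stored in the node.
        self.to_pivot = [dist(pivot, m) for m in self.members]
        # Covering radius: the farthest member from the pivot.
        self.radius = max(self.to_pivot, default=0.0)


def range_search(clusters, query, eps, dist):
    """Return (objects within eps of query, number of distance computations)."""
    hits = []
    computed = 0
    for c in clusters:
        d_qp = dist(query, c.pivot)
        computed += 1
        # Cluster-level pruning: every member m satisfies
        # d(q, m) >= d(q, p) - radius, so nothing here can be within eps.
        if d_qp > c.radius + eps:
            continue
        for m, d_mp in zip(c.members, c.to_pivot):
            # Object-level pruning: d(q, m) >= |d(q, p) - d(m, p)|.
            if abs(d_qp - d_mp) > eps:
                continue
            computed += 1
            if dist(query, m) <= eps:
                hits.append(m)
    return hits, computed


clusters = [
    Cluster((0.0, 0.0), [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)], euclidean),
    Cluster((10.0, 10.0), [(10.0, 10.0), (11.0, 10.0)], euclidean),
]
hits, computed = range_search(clusters, (0.9, 0.0), 0.2, euclidean)
print(hits, computed)
```

In this toy data set the second cluster is discarded after a single distance computation, and one member of the first cluster is filtered out using only its stored pivot distance. Tighter, less overlapping clusters make both tests succeed more often, which is the effect the abstract attributes to the CM-tree's construction algorithms.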
