Journal: IEEE Transactions on Knowledge and Data Engineering

ClusterTree: integration of cluster representation and nearest-neighbor search for large data sets with high dimensions

Abstract

We introduce the ClusterTree, a new indexing approach for representing clusters generated by any existing clustering approach. A cluster is decomposed into several subclusters and represented as the union of those subclusters. The subclusters can be decomposed further, which isolates the most closely related groups within each cluster. A ClusterTree is a hierarchy of clusters and subclusters that incorporates the cluster representation into the index structure to achieve effective and efficient retrieval. Our cluster representation is highly adaptive to any kind of cluster. It is well accepted that most existing indexing techniques degrade rapidly as the dimensionality increases. The ClusterTree provides a practical solution for indexing clustered data sets and supports effective retrieval of nearest neighbors without a linear scan of the high-dimensional data set. We also discuss an approach to dynamically reconstruct the ClusterTree when new data is added. We present a detailed analysis of this approach and justify it extensively with experiments.
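
The abstract does not give the ClusterTree construction or search procedure, so the following is only a rough Python sketch of the general idea it describes: a hierarchy of clusters and subclusters that answers nearest-neighbor queries without scanning every point. Everything specific here is an assumption made for illustration, not the authors' method: each cluster is summarized by a bounding sphere (centroid plus radius), subclusters come from a simple two-seed split, and the names `ClusterNode`, `split`, and `nearest_neighbor` are hypothetical.

```python
import math

# Illustrative sketch only. Bounding spheres and the two-seed split are
# assumptions; the abstract does not specify the actual representation.

def dist(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    n, dims = len(points), len(points[0])
    return tuple(sum(p[i] for p in points) / n for i in range(dims))

def split(points):
    """Partition points around two far-apart seeds (assumed heuristic)."""
    seed_a = max(points, key=lambda p: dist(points[0], p))
    seed_b = max(points, key=lambda p: dist(seed_a, p))
    left, right = [], []
    for p in points:
        (left if dist(p, seed_a) <= dist(p, seed_b) else right).append(p)
    return left, right

class ClusterNode:
    """A cluster represented as the union of its subclusters (children)."""

    def __init__(self, points, leaf_size=4):
        self.center = centroid(points)
        self.radius = max(dist(self.center, p) for p in points)
        self.points = points
        self.children = []
        if len(points) > leaf_size:
            left, right = split(points)
            if left and right:  # stay a leaf if the points cannot be separated
                self.children = [ClusterNode(left, leaf_size),
                                 ClusterNode(right, leaf_size)]

def nearest_neighbor(node, query, best=(math.inf, None)):
    """Return (distance, point) for the point nearest to query."""
    # No point inside the sphere can be closer than this lower bound,
    # so the whole subtree is skipped when it cannot beat the current best.
    if dist(query, node.center) - node.radius >= best[0]:
        return best
    if not node.children:
        for p in node.points:
            d = dist(query, p)
            if d < best[0]:
                best = (d, p)
        return best
    # Visit the closer subcluster first so the bound tightens early.
    for child in sorted(node.children, key=lambda c: dist(query, c.center)):
        best = nearest_neighbor(child, query, best)
    return best

data = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.8, 5.2)]
tree = ClusterNode(data, leaf_size=2)
distance, point = nearest_neighbor(tree, (4.9, 5.0))
```

In this sketch the pruning test relies on the triangle inequality, so the result matches a linear scan while skipping subclusters that lie too far from the query. The dynamic reconstruction on insertion that the abstract mentions is not covered here.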