IEEE Transactions on Knowledge and Data Engineering

CRAFTER: A Tree-Ensemble Clustering Algorithm for Static Datasets with Mixed Attributes and High Dimensionality


Abstract

Clustering is an important aspect of data mining, but clustering high-dimensional mixed-attribute data in a scalable fashion remains a challenging problem. In this paper, we propose CRAFTER, a tree-ensemble clustering algorithm for static datasets, to tackle this problem. CRAFTER handles categorical and numeric attributes simultaneously and scales well with both the dimensionality and the size of datasets. It leverages the advantages of a tree ensemble to handle mixed attributes and high dimensionality, and uses the concept of class probability estimates to identify the representative data points for clustering. Through a series of experiments on both synthetic and real datasets, we demonstrate that CRAFTER is superior to Random Forest Clustering (RFC), an existing tree-based clustering method, in terms of both clustering quality and computational cost.
