Fast algorithms for projected clustering

Abstract

The clustering problem is well known in the database literature for its numerous applications in problems such as customer segmentation, classification and trend analysis. Unfortunately, all known algorithms tend to break down in high dimensional spaces because of the inherent sparsity of the points. In such high dimensional spaces not all dimensions may be relevant to a given cluster. One way of handling this is to pick the closely correlated dimensions and find clusters in the corresponding subspace. Traditional feature selection algorithms attempt to achieve this. The weakness of this approach is that in typical high dimensional data mining applications different sets of points may cluster better for different subsets of dimensions. The number of dimensions in each such cluster-specific subspace may also vary. Hence, it may be impossible to find a single small subset of dimensions for all the clusters. We therefore discuss a generalization of the clustering problem, referred to as the projected clustering problem, in which the subsets of dimensions selected are specific to the clusters themselves. We develop an algorithmic framework for solving the projected clustering problem, and test its performance on synthetic data.
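The idea that each cluster carries its own subset of relevant dimensions can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes the cluster centers and their dimension subsets are already given by hand, and it assumes a distance that averages absolute differences over the selected dimensions only. The names `segmental_distance` and `assign_points` are hypothetical and chosen for illustration.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]
# A projected cluster here is a (center, relevant dimension indices) pair.
Cluster = Tuple[Point, Sequence[int]]


def segmental_distance(p: Point, q: Point, dims: Sequence[int]) -> float:
    """Average absolute difference between p and q over the given dimensions only."""
    if not dims:
        raise ValueError("dims must name at least one dimension")
    return sum(abs(p[d] - q[d]) for d in dims) / len(dims)


def assign_points(points: Sequence[Point], clusters: Sequence[Cluster]) -> List[int]:
    """Assign each point to the cluster whose center is nearest in that cluster's own subspace."""
    labels = []
    for p in points:
        dists = [segmental_distance(p, center, dims) for center, dims in clusters]
        labels.append(dists.index(min(dists)))
    return labels


# Toy 4-dimensional data (made up for illustration).
clusters = [
    ((1.0, 1.0, 0.0, 0.0), (0, 1)),  # cluster 0 is defined on dimensions 0 and 1
    ((0.0, 0.0, 5.0, 5.0), (2, 3)),  # cluster 1 is defined on dimensions 2 and 3
]
points = [
    (1.1, 0.9, 40.0, -30.0),   # close to cluster 0 in dims 0-1, noisy elsewhere
    (-20.0, 35.0, 5.2, 4.9),   # close to cluster 1 in dims 2-3, noisy elsewhere
]
labels = assign_points(points, clusters)
print(labels)
```

Averaging over the number of selected dimensions keeps distances comparable between clusters whose subspaces differ in size, which matters because the abstract notes that the number of dimensions may vary from cluster to cluster. A full-dimensional distance would be dominated by the noisy coordinates in both toy points.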

Bibliographic record

Source: ACM SIGMOD International Conference on Management of Data, 1999, pp. 61-72 (12 pages)
Authors: Charu C. Aggarwal; Joel L. Wolf; Philip S. Yu; Cecilia Procopiuc; Jong Soo Park
Original format: PDF
Chinese Library Classification: various specialized databases

Similar literature

1. Clustering High Dimensional Data Using Subspace and Projected Clustering Algorithms [J]. Rahmat Widia Sembiring, Jasni Mohamad Zain, Abdullah Embong. International Journal of Computer Science & Information Technology (IJCSIT), 2010, no. 4.
2. A Fast Projection-Based Algorithm for Clustering Big Data [J]. Wu Yun, He Zhiquan, Lin Hao. Interdisciplinary Sciences: Computational Life Sciences, 2019, no. 3.
3. A quadratic programming based cluster correspondence projection algorithm for fast point matching [J]. Wei Lian, Lei Zhang, Yan Liang. Computer Vision and Image Understanding, 2010, no. 3.
4. Fast Algorithms for Projected Clustering [C]. Charu C. Aggarwal, Cecilia Procopiuc, Joel L. Wolf. ACM SIGMOD International Conference on Management of Data, 1999.
5. Fast conceptual clustering algorithms for data mining and visualization [D]. Moustafa, Rida E. A. 2001.
6. Fast Nonnegative Matrix Factorization Algorithms Using Projected Gradient Approaches for Large-Scale Problems [O]. Rafal Zdunek, Andrzej Cichocki. 2008.
7. Fast algorithms for projected clustering [O]. Charu C. Aggarwal, Cecilia Procopiuc, Joel L. Wolf. 1999.
