K-means, a classical clustering algorithm, is sensitive to noise. In practical applications, data usually contain considerable noise, which makes it difficult to obtain a good clustering result. This paper proposes a K-means clustering algorithm with noise processing. The algorithm dynamically divides the original space into several regions, computes a similarity matrix between each sample and each regional centroid, weighted by the corresponding regional density, and uses this matrix as the input to K-means. The matrix effectively describes the distribution of the data and at the same time reduces the dimensionality of the features, so clustering tasks on noisy data can be handled more effectively. The proposed algorithm is better suited to data with complex distributions. Experimental results confirm the effectiveness of the algorithm.
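The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the regions are formed, how density is measured, or which similarity function is used. The sketch assumes that regions come from a preliminary K-means run, that a region's density is its share of the samples, and that similarity is a Gaussian kernel with a hypothetical bandwidth `sigma`. The function names are illustrative only.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's K-means; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: index of the nearest centroid for every point.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: an empty cluster keeps its previous centroid.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


def density_weighted_features(points, n_regions, sigma=1.0, seed=0):
    """Sample-to-region similarity matrix, weighted by region density.

    ASSUMPTIONS (not from the paper): regions are K-means cells, density is
    the fraction of samples in a region, similarity is a Gaussian kernel.
    """
    centroids, region_of = kmeans(points, n_regions, seed=seed)
    n = len(points)
    density = [region_of.count(j) / n for j in range(n_regions)]
    return [
        [
            density[j] * math.exp(-sq_dist(p, centroids[j]) / (2 * sigma ** 2))
            for j in range(n_regions)
        ]
        for p in points
    ]


def noise_aware_kmeans(points, k, n_regions, sigma=1.0, seed=0):
    """Cluster the density-weighted similarity rows instead of the raw data."""
    features = density_weighted_features(points, n_regions, sigma, seed)
    _, labels = kmeans(features, k, seed=seed)
    return labels


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
            (5.0, 5.0), (5.1, 4.9), (4.9, 5.2),
            (20.0, -20.0)]  # the last point plays the role of noise
    print(noise_aware_kmeans(data, k=2, n_regions=3))
```

In this sketch the matrix has one column per region, so it reduces dimensionality only when the number of regions is smaller than the number of original features. A sample far from every dense region gets small values in all columns, which is one plausible way the weighting could dampen the influence of noise.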