The CluStream algorithm performs poorly on non-spherical clusters, and most clustering algorithms based on uniform grid partitioning gain efficiency at the cost of clustering accuracy. This paper presents GTSClu, a new grid-based minimum spanning tree (MST) clustering algorithm for data streams. The algorithm is divided into an online processing phase and an offline clustering phase, and it combines grid splitting with minimum spanning tree techniques. GTSClu can find an arbitrary number of clusters of arbitrary shape and effectively exclude noisy data; experiments show that it improves both the efficiency and the quality of clustering.
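To make the general idea of combining a grid with a minimum spanning tree concrete, here is a minimal Python sketch. It is not the GTSClu procedure itself, since the abstract gives no algorithmic details. It only illustrates one plausible offline step: points are binned into uniform grid cells, sparse cells are discarded as noise, an MST is built over the remaining dense cells, and long MST edges are cut so that the remaining connected components form clusters. The function name `grid_mst_clusters` and the parameters `cell_size`, `min_density`, and `max_edge` are all assumptions introduced for illustration; the online phase and grid splitting are not modeled.

```python
from collections import Counter
import math


def grid_mst_clusters(points, cell_size=1.0, min_density=2, max_edge=1.5):
    """Illustrative sketch only; all parameters are hypothetical."""
    # Map each point to its grid cell and count the points per cell.
    counts = Counter(
        tuple(math.floor(x / cell_size) for x in p) for p in points
    )
    # Treat cells with too few points as noise and drop them.
    cells = sorted(c for c, n in counts.items() if n >= min_density)
    if not cells:
        return []

    # Prim's algorithm over the dense cells (distances in cell units).
    n = len(cells)
    in_tree = [False] * n
    best = [math.inf] * n
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, best[u]))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(cells[u], cells[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u

    # Keep only short MST edges; union-find groups cells into clusters.
    label = list(range(n))

    def find(i):
        while label[i] != i:
            label[i] = label[label[i]]
            i = label[i]
        return i

    for a, b, d in edges:
        if d <= max_edge:
            label[find(a)] = find(b)

    groups = {}
    for i, c in enumerate(cells):
        groups.setdefault(find(i), []).append(c)
    return sorted(groups.values())


points = [
    (0.1, 0.1), (0.2, 0.3), (1.1, 0.2), (1.5, 0.4), (1.2, 1.3), (1.4, 1.6),
    (10.1, 10.2), (10.3, 10.4),
    (5.5, 5.5),  # isolated point, falls in a sparse cell
]
clusters = grid_mst_clusters(points)
```

With these sample points, the L-shaped group of cells `(0, 0)`, `(1, 0)`, `(1, 1)` ends up in one cluster, the cell `(10, 10)` forms a second cluster, and the isolated point at `(5.5, 5.5)` is discarded as noise. Cutting MST edges rather than measuring distance to a centroid is what lets a sketch like this follow non-spherical shapes.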