To address shortcomings of data stream clustering algorithms based on density grids, this paper proposes NDD-Stream, an improved algorithm built on D-Stream. The algorithm sets the density threshold of grid cells dynamically, using statistics on grid-cell density and the number of clusters. To raise clustering precision at cluster boundaries, grid cells that lie on a cluster boundary are partitioned non-uniformly. Experiments on synthetic and real data sets show that the algorithm processes data quickly, detects dynamic changes in the data stream, and achieves good clustering quality.
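The idea of a density threshold derived from the observed grid densities can be illustrated with a minimal sketch. The abstract does not give the NDD-Stream formulas, so everything below is an assumption for illustration only: the class name `DensityGrid`, the parameters `cell_size`, `decay` and `factor`, and in particular the threshold rule (mean density of the non-empty cells times `factor`) are hypothetical. The sketch uses only a decayed per-cell density in the style of D-Stream; it does not use the number of clusters and does not implement the non-uniform partitioning of boundary cells.

```python
from collections import defaultdict
from statistics import mean


class DensityGrid:
    """Uniform density grid with exponential decay (D-Stream style sketch)."""

    def __init__(self, cell_size, decay=0.998):
        self.cell_size = cell_size          # side length of a grid cell
        self.decay = decay                  # decay factor per time step, in (0, 1)
        self.density = defaultdict(float)   # cell -> density at its last update
        self.last_update = {}               # cell -> time of its last update

    def cell_of(self, point):
        # Map a point to the integer coordinates of its grid cell.
        return tuple(int(x // self.cell_size) for x in point)

    def insert(self, point, t):
        # Decay the stored density to time t, then add 1 for the new point.
        cell = self.cell_of(point)
        elapsed = t - self.last_update.get(cell, t)
        self.density[cell] = self.density[cell] * self.decay ** elapsed + 1.0
        self.last_update[cell] = t
        return cell

    def density_at(self, cell, t):
        # Density of a cell decayed to time t (0.0 for a cell never seen).
        if cell not in self.last_update:
            return 0.0
        return self.density[cell] * self.decay ** (t - self.last_update[cell])

    def dynamic_threshold(self, t, factor=1.0):
        # ASSUMPTION: threshold = factor * mean density of non-empty cells.
        # This is a placeholder rule, not the rule used by NDD-Stream.
        values = [self.density_at(c, t) for c in self.last_update]
        return factor * mean(values) if values else 0.0

    def dense_cells(self, t, factor=1.0):
        # Cells whose decayed density exceeds the current threshold.
        threshold = self.dynamic_threshold(t, factor)
        return {c for c in self.last_update if self.density_at(c, t) > threshold}
```

Because the threshold is recomputed from the current densities each time it is queried, it follows the stream as cells fill up or decay, instead of being fixed in advance. A real implementation would also remove sporadic cells and group neighbouring dense cells into clusters, which this sketch leaves out.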