Traditional density-based clustering algorithms have high time complexity on massive data and are poorly suited to dynamic data. To address these problems, we propose a density-based algorithm that uses reference points and the MapReduce model to perform dynamic incremental clustering. Its novelty is that it can process massive dynamic data, guarantees that incremental clustering yields the same result as re-clustering from scratch, and is scalable. Experimental results show that the algorithm reduces parameter sensitivity, improves the clustering efficiency and resource utilisation of density-based clustering, and is suitable for big data analysis.
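The abstract does not spell out how reference points are chosen or how the map and reduce steps are defined, so the following is only a minimal single-process sketch of the general idea, not the authors' method. It assumes 2-D points, uses grid cells of side `eps` as stand-ins for reference points, and treats a cell as dense when it holds at least `min_pts` points; the names `map_phase`, `reduce_phase`, `clusters`, `eps`, and `min_pts` are all hypothetical. Because the reduce step only sums per-cell counts, merging a new batch into the existing counts gives the same state as recounting all of the data, which illustrates the kind of consistency between incremental clustering and re-clustering that the abstract claims.

```python
from collections import Counter
from itertools import product


def map_phase(points, eps):
    """Map step: emit (cell, 1) for every 2-D point.

    Assumption: a grid cell of side eps stands in for a reference point.
    """
    return [((int(x // eps), int(y // eps)), 1) for x, y in points]


def reduce_phase(pairs, counts=None):
    """Reduce step: sum the emitted values per cell.

    Passing the counts from earlier batches makes the update incremental.
    """
    merged = Counter(counts or {})
    for cell, n in pairs:
        merged[cell] += n
    return merged


def clusters(counts, min_pts):
    """Group dense cells (count >= min_pts) into clusters of adjacent cells."""
    dense = {cell for cell, n in counts.items() if n >= min_pts}
    seen, result = set(), set()
    for start in dense:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:
            cx, cy = stack.pop()
            component.add((cx, cy))
            # Visit the 8 neighbouring cells (and the cell itself, already seen).
            for dx, dy in product((-1, 0, 1), repeat=2):
                neighbour = (cx + dx, cy + dy)
                if neighbour in dense and neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        result.add(frozenset(component))
    return result
```

To use it, an initial batch would be processed with `reduce_phase(map_phase(batch, eps))`, and each later batch merged with `reduce_phase(map_phase(new_batch, eps), counts)` before calling `clusters` again. In a real MapReduce deployment the map and reduce functions would run as distributed tasks; here they are plain functions so that the sketch stays self-contained.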