Journal: Future Generation Computer Systems

Evolving clustering algorithm based on mixture of typicalities for stream data mining

Abstract

Many applications now produce streaming data, which motivates techniques for extracting knowledge from such sources. Accordingly, the development of data stream clustering algorithms has attracted increasing interest. However, applying these algorithms in real systems remains a challenge, since data streams often come from non-stationary environments, which can affect the choice of a proper set of model parameters for fitting the data or finding the correct number of clusters. This work proposes an evolving clustering algorithm based on a mixture of typicalities. It is based on the TEDA framework and divides the clustering problem into two subproblems: micro-clusters and macro-clusters. Experimental results on benchmark data sets showed that the proposed methodology can provide good results for clustering data and estimating its density even in the presence of events that can affect data distribution parameters, such as concept drifts. In addition, the model parameters proved robust when compared with state-of-the-art algorithms.
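The abstract gives no formulas, so the following is only a rough sketch of the recursive statistics usually associated with the TEDA (Typicality and Eccentricity Data Analytics) framework: a running mean and scalar variance, the eccentricity of a point, its typicality (one minus eccentricity), and a Chebyshev-style outlier test. The class name `TedaState`, the parameter `m`, and the handling of zero variance are assumptions made for illustration. This is not the paper's algorithm, and it does not implement the micro-cluster and macro-cluster stages or the mixture of typicalities.

```python
import numpy as np


class TedaState:
    """Hypothetical helper: recursive TEDA-style statistics for one data cloud."""

    def __init__(self, dim):
        self.k = 0                 # number of samples absorbed so far
        self.mean = np.zeros(dim)  # recursive mean
        self.var = 0.0             # recursive scalar variance

    def update(self, x):
        """Absorb one sample and update mean and variance in O(dim)."""
        x = np.atleast_1d(np.asarray(x, dtype=float))
        self.k += 1
        k = self.k
        if k == 1:
            self.mean = x.copy()
            self.var = 0.0
            return
        self.mean = (k - 1) / k * self.mean + x / k
        diff = x - self.mean
        self.var = (k - 1) / k * self.var + float(diff.dot(diff)) / (k - 1)

    def eccentricity(self, x):
        """Eccentricity of x with respect to the current statistics."""
        if self.k == 0:
            raise ValueError("no samples absorbed yet")
        x = np.atleast_1d(np.asarray(x, dtype=float))
        diff = x - self.mean
        dist2 = float(diff.dot(diff))
        if self.var == 0.0:
            # Assumption: with zero spread, only the mean itself is non-eccentric.
            return 1.0 / self.k if dist2 == 0.0 else float("inf")
        return 1.0 / self.k + dist2 / (self.k * self.var)

    def typicality(self, x):
        """Typicality is the complement of eccentricity."""
        return 1.0 - self.eccentricity(x)

    def is_outlier(self, x, m=3.0):
        """Chebyshev-style test on the normalized eccentricity (m acts like 'm sigma')."""
        zeta = self.eccentricity(x) / 2.0
        return zeta > (m ** 2 + 1) / (2 * self.k)
```

A possible use is to call `update` on each arriving sample and then query `is_outlier` for a candidate point. In an evolving clustering setting, a point flagged this way could seed a new micro-cluster, although that policy is an assumption here and is not stated in the abstract.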