Conference: Artificial Neural Networks in Pattern Recognition

SIC-Means: A Semi-fuzzy Approach for Clustering Data Streams Using C-Means


Abstract

In recent years, data streaming has gained significant importance. Advances in both hardware devices and software technologies enable many applications to generate continuous flows of data. This increases the need to develop algorithms that can process data streams efficiently. Additionally, the real-time requirements and evolving nature of data streams make stream mining problems, including clustering, challenging research problems. Fuzzy solutions have been proposed in the literature for clustering data streams. In this work, we propose a Soft Incremental C-Means variant to improve the performance of the fuzzy approach. The experimental evaluation shows that our approach achieves a better Xie-Beni index than the pure fuzzy approach as the factors that affect the clustering results are varied. In addition, we conducted a study to analyze the sensitivity of the clustering results to the allowed fuzziness level and the size of the data history used. This study shows that different datasets behave differently as these factors change, and that a dataset's behavior is correlated with the separation between its clusters.
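The abstract uses the Xie-Beni index as its quality measure. As a rough illustration only, the sketch below computes the index in its commonly cited form: membership-weighted compactness divided by the number of points times the smallest squared distance between two cluster centers, so lower values indicate compact, well-separated clusters. This is not the authors' implementation; the function names `sq_dist` and `xie_beni`, the input layout, and the fuzzifier default `m=2.0` are assumptions made for the example.

```python
from itertools import combinations


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def xie_beni(points, centers, memberships, m=2.0):
    """Xie-Beni index of a fuzzy partition (lower means better).

    points:      list of n points (each a sequence of numbers)
    centers:     list of c cluster centers, c >= 2
    memberships: c x n nested list; memberships[i][k] is the degree to
                 which point k belongs to cluster i
    m:           fuzzifier exponent (assumed 2.0 here, a common choice)
    """
    if len(centers) < 2:
        raise ValueError("need at least two clusters")
    # Compactness: membership-weighted squared distances to the centers.
    compactness = sum(
        (memberships[i][k] ** m) * sq_dist(points[k], centers[i])
        for i in range(len(centers))
        for k in range(len(points))
    )
    # Separation: smallest squared distance between two distinct centers.
    separation = min(sq_dist(a, b) for a, b in combinations(centers, 2))
    return compactness / (len(points) * separation)


# Hypothetical toy data: two tight groups of two points each.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
memberships = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
far = xie_beni(points, [(0.0, 0.5), (10.0, 0.5)], memberships)
```

Because the denominator is the minimum center-to-center distance, datasets whose clusters sit close together yield larger index values for the same within-cluster spread, which is consistent with the abstract's remark that dataset behavior is correlated with cluster separation.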
