2011 23rd IEEE International Conference on Tools with Artificial Intelligence

A Micro-cluster Based Ensemble Approach for Classifying Distributed Data Streams

Abstract

Mining distributed data streams has been a focus of much research in recent years and has raised many challenging problems. One of these problems is learning and maintaining global patterns from multiple data streams in distributed environments. In this paper, we discuss micro-cluster based classification problems in distributed data streams and propose methods for mining data streams in distributed environments that handle both labeled and unlabeled data. At each local site, a local micro-cluster based ensemble is used, and algorithms for updating it are designed. Using time-based sliding window techniques, the local models for a fixed time span are transferred to a central site after they have been generated at all local sites, and the global patterns for that time span are then mined at the central site. In our methods, the global patterns are micro-cluster based rather than typical classifiers such as decision trees, which can achieve the expected classification accuracy while maintaining higher mining performance. The experiments show that these methods are effective and efficient for classifying multiple data streams in distributed environments.
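
The local-to-central flow described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not give the update rules, so the sketch assumes a generic additive micro-cluster summary (count, linear sum, class label), a fixed absorption radius, and nearest-centroid classification. It covers labeled data only. All names (`MicroCluster`, `build_local_model`, `build_global_model`, `classify`, `radius`) are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class MicroCluster:
    """Additive summary of a group of points: count, linear sum, class label."""
    n: int
    linear_sum: list
    label: str

    def centroid(self):
        return [s / self.n for s in self.linear_sum]

    def absorb(self, point):
        # Add a single point to the summary.
        self.n += 1
        self.linear_sum = [s + x for s, x in zip(self.linear_sum, point)]

    def merge(self, other):
        # Summaries are additive, so merging is a component-wise sum.
        self.n += other.n
        self.linear_sum = [a + b for a, b in zip(self.linear_sum, other.linear_sum)]


def nearest(clusters, point, label=None):
    """Return (cluster, distance) for the closest centroid, optionally per label."""
    best, best_d = None, math.inf
    for mc in clusters:
        if label is not None and mc.label != label:
            continue
        d = math.dist(mc.centroid(), point)
        if d < best_d:
            best, best_d = mc, d
    return best, best_d


def build_local_model(window, radius):
    """Local site: summarise the labeled points of one time window."""
    clusters = []
    for point, label in window:
        mc, d = nearest(clusters, point, label)
        if mc is not None and d <= radius:
            mc.absorb(point)
        else:
            clusters.append(MicroCluster(1, list(point), label))
    return clusters


def build_global_model(local_models, radius):
    """Central site: merge the micro-clusters received from all local sites."""
    merged = []
    for model in local_models:
        for mc in model:
            target, d = nearest(merged, mc.centroid(), mc.label)
            if target is not None and d <= radius:
                target.merge(mc)
            else:
                merged.append(MicroCluster(mc.n, list(mc.linear_sum), mc.label))
    return merged


def classify(model, point):
    """Predict the label of the nearest micro-cluster centroid."""
    mc, _ = nearest(model, point)
    return mc.label if mc else None


# Toy data (assumed): two sites observing the same time window.
site_a = [((0.0, 0.0), "neg"), ((0.2, 0.1), "neg"), ((5.0, 5.0), "pos")]
site_b = [((0.1, 0.3), "neg"), ((5.2, 4.9), "pos"), ((4.8, 5.1), "pos")]
local_models = [build_local_model(w, radius=1.0) for w in (site_a, site_b)]
global_model = build_global_model(local_models, radius=1.0)
print(classify(global_model, (4.5, 4.7)))  # prints: pos
```

Only the compact summaries travel to the central site, never the raw points, which is why a micro-cluster model can be cheaper to ship and merge than a decision tree retrained per window.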