Journal: Knowledge and Information Systems

Effective classification of noisy data streams with attribute-oriented dynamic classifier selection



Abstract

Recently, mining data streams has become an important and challenging task in many real-world applications, such as credit card fraud protection and sensor networking. One popular solution is to split the stream into chunks, learn a base classifier from each chunk, and then integrate all base classifiers for classification. In this paper, we propose a new dynamic classifier selection (DCS) mechanism for integrating base classifiers when mining data streams. The proposed algorithm dynamically selects a single “best” classifier to classify each test instance at run time. Our scheme uses statistical information from attribute values: each attribute is used to partition the evaluation set into disjoint subsets, and the classification accuracy of each base classifier is then evaluated on these subsets. Given a test instance, its attribute values determine the subsets of the evaluation set that contain similar instances, and the classifier with the highest classification accuracy on those subsets is selected to classify the test instance. Experimental results and comparative studies demonstrate the efficiency and efficacy of our method. Such a DCS scheme appears promising for mining data streams with dramatic concept drift or a significant amount of noise, where the base classifiers are likely to conflict or have low confidence.
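The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy data, the restriction to categorical attributes, the use of a plain mean to combine per-attribute accuracies, and the tie-breaking rule (lowest classifier index) are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import defaultdict


def build_accuracy_table(classifiers, eval_set):
    """Return table[c][(a, v)] = [correct, total] for classifier c on the
    evaluation instances whose attribute a takes value v.
    Each attribute thereby partitions the evaluation set into disjoint subsets."""
    table = [defaultdict(lambda: [0, 0]) for _ in classifiers]
    for x, y in eval_set:
        for c, clf in enumerate(classifiers):
            hit = int(clf(x) == y)
            for a, v in enumerate(x):
                cell = table[c][(a, v)]
                cell[0] += hit
                cell[1] += 1
    return table


def select_classifier(table, x):
    """Return the index of the classifier with the highest mean accuracy over
    the subsets matched by x's attribute values (assumed combination rule).
    Attribute values never seen in the evaluation set are skipped."""
    best, best_score = 0, -1.0
    for c, cells in enumerate(table):
        accs = []
        for a, v in enumerate(x):
            if (a, v) in cells:
                correct, total = cells[(a, v)]
                accs.append(correct / total)
        score = sum(accs) / len(accs) if accs else 0.0
        if score > best_score:  # strict '>' keeps the lowest index on ties
            best, best_score = c, score
    return best


def classify(classifiers, table, x):
    """Classify x with the single classifier selected for it."""
    return classifiers[select_classifier(table, x)](x)


# Hypothetical toy setup: two attributes (color, size), two base classifiers.
classifiers = [
    lambda x: "pos" if x[0] == "red" else "neg",   # looks only at color
    lambda x: "pos" if x[1] == "big" else "neg",   # looks only at size
]
eval_set = [
    (("red", "big"), "pos"),
    (("red", "small"), "pos"),
    (("blue", "big"), "pos"),
    (("blue", "small"), "neg"),
    (("blue", "big"), "pos"),
]
table = build_accuracy_table(classifiers, eval_set)

print(select_classifier(table, ("red", "small")))  # color-based classifier wins
print(select_classifier(table, ("blue", "big")))   # size-based classifier wins
```

In this toy setup, different test instances are routed to different base classifiers, which is the behavior the abstract attributes to DCS. Handling of numeric attributes and of stream chunks is left out of the sketch.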
