Journal: Data Mining and Knowledge Discovery

A drift detection method based on dynamic classifier selection

Abstract

Machine learning algorithms can be applied to several practical problems, such as spam, fraud, and intrusion detection, and the modeling of customer preferences. In most of these problems, data arrive in streams, which means that the data distribution may change over time, leading to concept drift. The literature offers many supervised methods based on error monitoring for explicit drift detection. However, these methods may become infeasible in real-world applications where fully labeled data are not available, and they may depend on a significant decrease in accuracy before they can detect drifts. There are also methods based on blind approaches, where the decision model is updated constantly. However, this may lead to unnecessary system updates. To overcome these drawbacks, we propose in this paper a semi-supervised drift detector that uses an ensemble of classifiers based on self-training online learning and dynamic classifier selection. For each unknown sample, a dynamic selection strategy is used to choose, among the ensemble's members, the classifier most likely to classify it correctly. The prediction assigned by the chosen classifier is used to compute an estimate of the error produced by the ensemble members. The proposed method monitors this pseudo-error in order to detect drifts and updates the decision model only after a drift is detected. The method is relevant in that it supports both drift detection and reaction and is applicable to several practical problems. The experiments conducted indicate that the proposed method attains high performance and detection rates, while reducing the amount of labeled data used to detect drift.
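
The pipeline described in the abstract (select one classifier per sample, treat its prediction as a pseudo-label, estimate the ensemble's error against that label, and monitor the estimate) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not specify the selection strategy or the detector, so the sketch assumes a local-accuracy rule over the k nearest labeled validation points and a DDM-style threshold on the running mean. All names (select_classifier, pseudo_error, PseudoErrorMonitor, min_samples, factor) are hypothetical, and self-training and model updates after a detection are left out.

```python
import math
from typing import Callable, List, Sequence, Tuple

Sample = Sequence[float]
Classifier = Callable[[Sample], int]  # hypothetical: maps features to a class label
Labeled = Tuple[Sample, int]


def select_classifier(x: Sample, ensemble: List[Classifier],
                      validation: List[Labeled], k: int = 3) -> Classifier:
    """Assumed selection rule: best accuracy on the k labeled points nearest to x."""
    neighbours = sorted(validation, key=lambda pair: math.dist(pair[0], x))[:k]

    def local_hits(clf: Classifier) -> int:
        return sum(clf(xv) == yv for xv, yv in neighbours)

    return max(ensemble, key=local_hits)  # ties go to the earliest member


def pseudo_error(x: Sample, ensemble: List[Classifier],
                 validation: List[Labeled], k: int = 3) -> float:
    """Share of members that disagree with the selected classifier's prediction."""
    pseudo_label = select_classifier(x, ensemble, validation, k)(x)
    return sum(clf(x) != pseudo_label for clf in ensemble) / len(ensemble)


class PseudoErrorMonitor:
    """Assumed DDM-style rule on the running mean of the pseudo-error."""

    def __init__(self, min_samples: int = 30, factor: float = 3.0) -> None:
        self.min_samples = min_samples
        self.factor = factor
        self.n = 0
        self.mean = 0.0
        self.best_mean = math.inf
        self.best_std = math.inf

    def update(self, err: float) -> bool:
        """Feed one pseudo-error in [0, 1]; return True when drift is signalled."""
        self.n += 1
        self.mean += (err - self.mean) / self.n
        # Bernoulli-style spread; clamped so rounding cannot make it negative.
        std = math.sqrt(max(0.0, self.mean * (1.0 - self.mean)) / self.n)
        if self.n < self.min_samples:
            return False
        # Remember the lowest (mean + std) level seen so far.
        if self.mean + std < self.best_mean + self.best_std:
            self.best_mean, self.best_std = self.mean, std
        return self.mean + std > self.best_mean + self.factor * self.best_std
```

In this sketch, true labels are needed only for the small validation set used during selection. The monitored signal itself comes from unlabeled samples, which matches the abstract's goal of reducing the labeled data required for detection.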