Journal: ACM Transactions on Knowledge Discovery from Data

Multi-Label Punitive kNN with Self-Adjusting Memory for Drifting Data Streams


Abstract

In multi-label learning, data may simultaneously belong to more than one class. When multi-label data arrive as a stream, the challenges associated with multi-label learning are joined by those of data stream mining, including the need for algorithms that are fast and flexible, able to match both the speed and the evolving nature of the stream. This article presents a punitive k-nearest neighbors algorithm with a self-adjusting memory (MLSAMPkNN) for multi-label, drifting data streams. The memory adjusts in size to contain only the current concept, and a novel punitive system identifies and penalizes errant data examples early, removing them from the window. By retaining and using only data that are both current and beneficial, MLSAMPkNN is able to adapt quickly and efficiently to changes within the data stream while still maintaining a low computational complexity. Additionally, the punitive removal mechanism offers increased robustness to various data-level difficulties present in data streams, such as class imbalance and noise. The experimental study compares the proposal to 24 algorithms using 30 real-world and 15 artificial multi-label data streams on six multi-label metrics, evaluation time, and memory consumption. The superior performance of the proposed method is validated through non-parametric statistical analysis, proving both high accuracy and low time complexity. MLSAMPkNN is a versatile classifier, capable of returning excellent performance in diverse stream scenarios.
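The abstract does not spell out the punitive rule or the memory-resizing procedure, so the following is only a minimal sketch of the general idea, not the MLSAMPkNN algorithm itself. It assumes a simple majority vote among the k nearest window examples, a hypothetical per-example penalty counter that grows whenever a neighbor's label set disagrees with the true labels of an incoming example, and a fixed window cap in place of the self-adjusting memory. The names `Example`, `PunitiveWindowKNN`, and `penalty_limit` are illustrative and do not come from the article.

```python
from collections import deque
from dataclasses import dataclass
import math


@dataclass
class Example:
    x: tuple          # feature vector
    y: frozenset      # set of relevant label indices
    penalty: int = 0  # hypothetical error counter (assumption, not from the article)


class PunitiveWindowKNN:
    """Illustrative sliding-window multi-label kNN with a penalty counter."""

    def __init__(self, k=3, max_window=200, penalty_limit=2):
        self.k = k
        self.penalty_limit = penalty_limit
        # Fixed-size cap: a stand-in for the self-adjusting memory, which
        # the abstract describes but does not specify.
        self.window = deque(maxlen=max_window)

    def _neighbors(self, x):
        # The k window examples closest to x by Euclidean distance.
        return sorted(self.window, key=lambda e: math.dist(e.x, x))[: self.k]

    def predict(self, x):
        neighbors = self._neighbors(x)
        if not neighbors:
            return frozenset()
        votes = {}
        for e in neighbors:
            for label in e.y:
                votes[label] = votes.get(label, 0) + 1
        # A label is predicted when more than half of the neighbors carry it.
        return frozenset(l for l, v in votes.items() if v > len(neighbors) / 2)

    def learn(self, x, y):
        y = frozenset(y)
        # Assumed punitive rule: penalize each neighbor whose label set
        # differs from the true label set of the new example.
        for e in self._neighbors(x):
            if e.y != y:
                e.penalty += 1
        # Drop examples that reached the penalty limit, then store the new one.
        self.window = deque(
            (e for e in self.window if e.penalty < self.penalty_limit),
            maxlen=self.window.maxlen,
        )
        self.window.append(Example(tuple(x), y))
```

In a test-then-train stream loop, `predict` would be called on each arriving example before `learn` receives its true labels. Exact-match disagreement is a deliberately crude penalty criterion chosen to keep the sketch short; a per-label criterion would be a natural refinement.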
