Journal: International Journal of Data Mining & Knowledge Management Process

Incremental Learning From Unbalanced Data with Concept Class, Concept Drift and Missing Features : A Review



Abstract

Stream data mining applications have recently drawn considerable attention from several research communities. Stream data is a continuous form of data distinguished by its online nature. Traditionally, the machine learning field has developed learning algorithms that make certain assumptions about the underlying distribution of the data, for example that the data follow a predetermined distribution. Such constraints on the problem domain have enabled the development of learning algorithms whose performance is theoretically verifiable. Real-world situations differ from this restricted model. Applications usually suffer from problems such as unbalanced data distribution. In addition, data drawn from non-stationary environments are common in real-world applications, which gives rise to “concept drift” in data stream examples. Researchers have addressed these issues separately, but the joint problem of class imbalance and concept drift has received relatively little attention. If the ultimate objective of intelligent machine learning techniques is to address a broad spectrum of real-world applications, then the need for a universal framework for learning from, and adapting to, environments where concept drift may occur and the data distribution is unbalanced can hardly be overstated. In this paper, we first present an overview of the issues observed in stream data mining scenarios, followed by a complete review of recent research dealing with each of these issues.
