
Learning from Imbalanced Data


Abstract

With the continuous expansion of data availability in many large-scale, complex, and networked systems, such as surveillance, security, Internet, and finance, it becomes critical to advance the fundamental understanding of knowledge discovery and analysis from raw data to support decision-making processes. Although existing knowledge discovery and data engineering techniques have shown great success in many real-world applications, the problem of learning from imbalanced data (the imbalanced learning problem) is a relatively new challenge that has attracted growing attention from both academia and industry. The imbalanced learning problem is concerned with the performance of learning algorithms in the presence of underrepresented data and severe class distribution skews. Due to the inherent complex characteristics of imbalanced data sets, learning from such data requires new understandings, principles, algorithms, and tools to transform vast amounts of raw data efficiently into information and knowledge representation. In this paper, we provide a comprehensive review of the development of research in learning from imbalanced data. Our focus is to provide a critical review of the nature of the problem, the state-of-the-art technologies, and the current assessment metrics used to evaluate learning performance under the imbalanced learning scenario. Furthermore, in order to stimulate future research in this field, we also highlight the major opportunities and challenges, as well as potential important research directions for learning from imbalanced data.
