
Imbalance-XGBoost: leveraging weighted and focal losses for binary label-imbalanced classification with XGBoost


Abstract

The paper presents Imbalance-XGBoost, a Python package that combines the powerful XGBoost software with weighted and focal losses to tackle binary label-imbalanced classification tasks. Though small in size, the package is, to the best of our knowledge, the first of its kind to provide an integrated implementation of the two loss functions on XGBoost, bringing a general-purpose extension to XGBoost for label-imbalanced scenarios. In this paper, the design and usage of the package are discussed and illustrated with examples. Furthermore, as the first- and second-order derivatives of the loss functions are essential for the implementation, their algebraic derivation is discussed and can be regarded as a separate contribution. The performance of the methods implemented in the package is extensively evaluated on a Parkinson's disease classification dataset, and multiple competitive results are presented with ROC and Precision-Recall (PR) curves. To further support the superiority of the methods, performance on four other benchmark datasets from the UCI machine learning repository is additionally reported. Given the scalable nature of XGBoost, the package has great potential to be broadly applied to real-life binary classification tasks, which are usually large-scale and label-imbalanced.
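The abstract notes that the first- and second-order derivatives of the losses are what a boosting implementation needs. As a minimal sketch, not the package's own code, the block below works this out for a weighted cross-entropy loss, assuming the common form in which a factor `alpha` multiplies the positive-class term and the raw score `z` is passed through a sigmoid. The function names and the `alpha` parameter are illustrative assumptions; the focal loss follows the same pattern with a longer derivation and is not shown.

```python
import numpy as np


def sigmoid(z):
    """Map raw scores (logits) to probabilities."""
    return 1.0 / (1.0 + np.exp(-z))


def weighted_bce_loss(z, y, alpha):
    """Per-sample weighted cross-entropy on raw scores z.

    Assumed form: L = -(alpha * y * log(p) + (1 - y) * log(1 - p)),
    with p = sigmoid(z). alpha > 1 up-weights the positive class.
    """
    p = sigmoid(z)
    return -(alpha * y * np.log(p) + (1.0 - y) * np.log(1.0 - p))


def weighted_bce_grad_hess(z, y, alpha):
    """First and second derivatives of the loss above with respect to z.

    Using d log(p)/dz = 1 - p and d log(1 - p)/dz = -p:
      grad = p * (1 - y) - alpha * y * (1 - p)
      hess = p * (1 - p) * (alpha * y + 1 - y)
    """
    p = sigmoid(z)
    grad = p * (1.0 - y) - alpha * y * (1.0 - p)
    hess = p * (1.0 - p) * (alpha * y + 1.0 - y)
    return grad, hess


if __name__ == "__main__":
    # Hypothetical toy scores and labels, for illustration only.
    z = np.array([-1.5, 0.0, 2.0, 0.7])
    y = np.array([1.0, 0.0, 1.0, 0.0])
    grad, hess = weighted_bce_grad_hess(z, y, alpha=3.0)
    print("grad:", grad)
    print("hess:", hess)
```

A pair of arrays like this is what XGBoost expects from a custom objective: a callable taking the predictions and the training `DMatrix` and returning `(grad, hess)` can be passed as the `obj` argument of `xgboost.train`. Wrapping the function above that way would require reading the labels from the `DMatrix`, which is omitted here.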

Bibliographic details

  • Source
    Pattern Recognition Letters | 2020, Issue 8 | pp. 190-197 | 8 pages
  • Author affiliations

    Department of Computer Science, Rutgers University, Piscataway, NJ 08854, USA;

    Department of Computer Science, Rutgers University, Piscataway, NJ 08854, USA;

    Department of Health Statistics, Weifang Medical University, Weifang, Shandong 261053, China;

  • Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
  • Original format: PDF
  • Language: English
  • Chinese Library Classification
  • Keywords

    Imbalanced classification; XGBoost; Python package;

  • Added to database: 2022-08-18 21:28:45
