IEEE Transactions on Neural Networks and Learning Systems

RBoost: Label Noise-Robust Boosting Algorithm Based on a Nonconvex Loss Function and the Numerically Stable Base Learners


Abstract

AdaBoost has attracted much attention in the machine learning community because of its excellent performance in combining weak classifiers into a strong classifier. However, AdaBoost tends to overfit noisy data in many applications, so improving its noise tolerance is important. AdaBoost's sensitivity to noisy data stems from its exponential loss function, which imposes unbounded penalties on misclassified samples with very large margins. In this paper, we propose two boosting algorithms, referred to as RBoost1 and RBoost2, that are more robust to noisy data than AdaBoost. RBoost1 and RBoost2 optimize a nonconvex loss function of the classification margin. Because the penalty on a misclassified sample is bounded above by one, RBoost1 and RBoost2 do not over-focus on samples that are consistently misclassified by the previous base learners. Beyond the loss function, at each boosting iteration RBoost1 and RBoost2 compute the base learners in numerically stable ways. These two improvements make the proposed algorithms robust to noise in both the training and testing samples. Experimental results on a synthetic Gaussian data set, the UCI data sets, and a real malware-behavior data set show that the proposed RBoost1 and RBoost2 algorithms perform better when the training data contain noise.
