GA-Boost: a genetic algorithm for robust boosting.

Abstract

Many methods, both simple and complex, have been developed to solve the classification problem. Boosting is one of the best-known techniques for improving the prediction accuracy of classification methods, but it is sometimes prone to overfitting, and the final model is difficult to interpret. Some boosting methods, including AdaBoost, are very sensitive to outliers. Many researchers have worked on these problems, but they remain active research topics.

We introduce a new boosting algorithm, "GA-Boost", which directly optimizes weak learners and their associated weights using a genetic algorithm, together with three extended versions of GA-Boost. The genetic algorithm uses a new penalized fitness function with three parameters (a, b, and p): b limits the number of weak classifiers, a controls the effects of outliers, and the function is designed to maximize an appropriately chosen p-th percentile of the margins. We evaluate the performance of GA-Boost with an experimental design and compare it to AdaBoost using several artificial data sets and real-world data sets from the UC-Irvine Machine Learning Repository.

In the experiments, GA-Boost was more resistant to outliers and produced simpler predictive models than AdaBoost. GA-Boost can be applied to data sets with three different weak classifier options. The three extended versions of GA-Boost performed very well on two simulation data sets and three real-world data sets.
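
As a rough illustration of the kind of objective the abstract describes, the Python sketch below scores a candidate ensemble by the p-th percentile of its normalized margins, minus a penalty of b for each weak classifier with nonzero weight. The abstract does not give the actual formula, so this functional form, the nearest-rank percentile, and the names `margins`, `percentile`, and `fitness` are assumptions made for illustration only. The outlier parameter a is left out because its exact role is not specified in the abstract.

```python
import math


def margins(votes, weights, labels):
    """Normalized margin y_i * sum_j w_j * h_j(x_i) / sum_j |w_j| per sample.

    votes[j][i] is weak classifier j's prediction (+1 or -1) on sample i;
    labels[i] is the true class (+1 or -1).
    """
    total = sum(abs(w) for w in weights)
    if total == 0:
        return [0.0 for _ in labels]
    return [
        y * sum(w * h[i] for w, h in zip(weights, votes)) / total
        for i, y in enumerate(labels)
    ]


def percentile(values, p):
    """Nearest-rank p-th percentile, for 0 < p <= 100."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def fitness(votes, weights, labels, b, p):
    """Hypothetical penalized fitness (an assumption, not the thesis formula).

    Rewards a large p-th percentile margin and subtracts b for every
    weak classifier that carries a nonzero weight.
    """
    n_active = sum(1 for w in weights if w != 0)
    return percentile(margins(votes, weights, labels), p) - b * n_active


# Toy data: three weak classifiers, five samples.
labels = [1, 1, -1, -1, 1]
votes = [
    [1, 1, -1, -1, -1],   # wrong only on the last sample
    [1, -1, -1, -1, -1],
    [1, 1, 1, -1, 1],
]
score = fitness(votes, [1.0, 0.0, 0.0], labels, b=0.05, p=40)
```

In this sketch, a genetic algorithm would evaluate `fitness` on each candidate weight vector and prefer higher values. A small p ties the score to the hardest examples, while a larger p lets a few badly misclassified points be ignored.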

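Bibliographic record

  • Author

    Oh, Dong-Yop.

  • Author affiliation

    The University of Alabama.

  • Degree-granting institution: The University of Alabama.
  • Subject: Statistics; Computer Science.
  • Degree: Ph.D.
  • Year: 2012
  • Pages: 146 p.
  • Total pages: 146
  • Original format: PDF
  • Language of text: eng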
