
Impure Decision Trees for Auc and Log loss optimization



Abstract

Decision trees are among the most popular supervised machine learning algorithms, and also among the easiest to understand. Finding an optimal decision tree for a given dataset is, however, a hard task, and the use of multiple performance metrics adds complexity to the problem of selecting the most appropriate tree. The most widely used evaluation metrics are log loss and AUC, both of which penalize probability predictions of exactly zero or one. Several techniques have been proposed to deal with this issue, among them Laplace correction, which adjusts the probability estimates at the leaf nodes to avoid pure-node predictions after tree construction. In this paper, we propose a new adjustment of the probability estimates that avoids the creation of pure nodes during the construction of the tree rather than after it. This approach aims to improve model performance on the log loss and AUC metrics. We present experiments suggesting that our method finds better-performing trees regardless of the metric used.
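To make the issue concrete: a pure leaf predicts a probability of exactly 0 or 1, and log loss diverges on any misrouted example. Laplace correction replaces the raw leaf frequency k/n with (k+1)/(n+C) for C classes. A minimal sketch of that correction (function names are illustrative, not from the paper):

```python
import math

def leaf_probability(positives, total, num_classes=2):
    """Laplace-corrected probability estimate at a leaf node.

    The raw frequency positives/total can be exactly 0 or 1 at a
    pure leaf; adding one pseudo-count per class keeps the estimate
    strictly inside (0, 1): p = (positives + 1) / (total + num_classes).
    """
    return (positives + 1) / (total + num_classes)

def log_loss_term(y_true, p):
    """Per-example log loss; diverges as p approaches 0 or 1."""
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

# A pure leaf with 5 positives out of 5 examples: the raw estimate
# is 1.0, so log loss is infinite for any negative example routed
# there; the corrected estimate (5+1)/(5+2) ≈ 0.857 keeps it finite.
p_laplace = leaf_probability(5, 5)
penalty = log_loss_term(0, p_laplace)
```

The paper's proposal differs in that it prevents pure nodes from arising during tree construction, rather than correcting leaf estimates afterwards as above.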


