Indian Journal of Science and Technology

Improving Classification Accuracy based on Random Forest Model through Weighted Sampling for Noisy Data with Linear Decision Boundary

Abstract

Background: Random forest algorithms typically draw a simple random sample of observations to build each decision tree. Under random selection, noisy, outlying, and non-informative data points can enter tree construction, which leads to poor ensemble classification decisions. This paper aims to optimize sample selection through probability-proportional-to-size sampling (weighted sampling), in which noisy, outlying, and non-informative data points are down-weighted to improve the classification accuracy of the model. Methods: The weight of each data point is determined from two components: the point's influence on the model, found by the leave-one-out method using a single classification tree, and the point's deviance residual, measured using a logistic regression model. The two are combined into the final weight. Results: The proposed Finest Random Forest (FRF) performs consistently better than the conventional Random Forest (RF) in terms of classification accuracy. Conclusion: Classification accuracy improves when the random forest is combined with probability-proportional-to-size sampling (weighted sampling) for noisy data with a linear decision boundary.
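The weighting idea in the Methods can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The abstract does not say how the two components are combined, so the rule in `sampling_weights` is a hypothetical choice. The fitted probabilities and leave-one-out influence scores are assumed to come from a logistic regression and a single classification tree fitted elsewhere, and the labels are assumed to be binary (0 or 1). The deviance residual itself is the standard one for a binary outcome.

```python
import math
import random


def deviance_residual(y, p, eps=1e-12):
    """Deviance residual of a binary label y (0 or 1) given fitted probability p."""
    p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
    log_lik = y * math.log(p) + (1 - y) * math.log(1.0 - p)
    # The sign follows the raw residual y - p.
    return math.copysign(math.sqrt(-2.0 * log_lik), y - p)


def sampling_weights(labels, probs, influence):
    """Hypothetical combination rule (an assumption, not from the paper):
    points with a large |deviance residual| or a large leave-one-out
    influence receive a smaller sampling weight."""
    raw = [
        1.0 / (1.0 + abs(deviance_residual(y, p)) + abs(inf))
        for y, p, inf in zip(labels, probs, influence)
    ]
    total = sum(raw)
    return [w / total for w in raw]  # normalize so the weights sum to 1


def weighted_bootstrap(n, weights, rng):
    """Draw n row indices with replacement, with probability proportional to weight."""
    return rng.choices(range(len(weights)), weights=weights, k=n)


# Toy inputs (made up for illustration): the last point has label 1 but a
# fitted probability of 0.05, so it looks noisy and should be down-weighted.
labels = [1, 1, 0, 0, 1]
probs = [0.9, 0.8, 0.2, 0.1, 0.05]
influence = [0.0, 0.0, 0.0, 0.0, 0.2]

weights = sampling_weights(labels, probs, influence)
sample = weighted_bootstrap(len(labels), weights, random.Random(0))
```

Each tree in the forest would then be grown on its own `weighted_bootstrap` draw in place of a simple random bootstrap sample, so suspect points appear less often across the ensemble.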
