The interaction of sampling ratio and modelling method in prediction of binary target with rare target class

Abstract

In many practical predictive data mining problems with a binary target, one of the target classes is rare. In such a situation it is common practice to decrease the ratio of common to rare class cases in the training set by under-sampling the common class. The relationship between the ratio of common to rare class cases in the training set and model performance was investigated empirically on three artificial and three real-world data sets. The results indicated that a flexible modelling method without regularisation benefits in both mean and variance of performance from a larger ratio when evaluated on a criterion sensitive to overfitting, and benefits in mean but not variance of performance when evaluated on a criterion less sensitive to overfitting. For an inflexible modelling method and a flexible method with regularisation, the effects of a larger ratio were less consistent. In no circumstances, however, was a larger ratio found to be detrimental to model performance, however measured.
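
The under-sampling step described in the abstract can be illustrated with a minimal sketch. This is not the author's procedure: the function name `undersample_common`, its parameters, and the toy data are all hypothetical, and the sketch only shows how a training set with a chosen common-to-rare ratio might be drawn.

```python
import random


def undersample_common(rows, labels, ratio, rare_label=1, seed=0):
    """Keep every rare-class case and a random subset of common-class cases
    so that common:rare in the returned training set is at most `ratio`.

    All names here are illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    rare_idx = [i for i, y in enumerate(labels) if y == rare_label]
    common_idx = [i for i, y in enumerate(labels) if y != rare_label]
    # Number of common-class cases to keep, capped at what is available.
    n_common = min(len(common_idx), int(round(ratio * len(rare_idx))))
    kept = sorted(rare_idx + rng.sample(common_idx, n_common))
    return [rows[i] for i in kept], [labels[i] for i in kept]


if __name__ == "__main__":
    # Toy data (an assumption): 5 rare cases among 1000.
    labels = [1] * 5 + [0] * 995
    rows = list(range(1000))
    for ratio in (1, 5, 20):
        _, y = undersample_common(rows, labels, ratio)
        print(ratio, y.count(0), y.count(1))
```

Fitting the same model on training sets drawn at several ratios, and comparing the mean and variance of a performance criterion across repeated draws, would mirror the kind of comparison the abstract reports.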

Bibliographic details

  • Author: Hirschowitz Steven
  • Year: 2009
  • Original format: PDF
  • Language of text: en
