International Conference on Advanced Data Mining and Applications

Logistic Regression Bias Correction for Large Scale Data with Rare Events

Abstract

Logistic regression is a classical classification method that has been widely used in applications with a binary dependent variable. However, when the data set is imbalanced, traditional logistic regression underestimates the probability of the rare event. With the data explosion of recent years, some researchers have proposed large-scale logistic regression methods, but these still fail to account for rare events, so their models are biased when applied to large-scale data sets with rare events. To address these problems, this paper proposes the LRBC method, which corrects the bias of logistic regression for large-scale data sets with rare events. Empirical studies compare LRBC with several state-of-the-art algorithms on a real ad-click data set. The results demonstrate that LRBC achieves much better classification performance and that the distributed bias-correction process scales well.
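The abstract does not describe how LRBC works, so the sketch below is not the paper's method. It illustrates a different, standard adjustment from the rare-events literature, often called prior correction of the intercept, which applies when a logistic model is fitted on a sample whose event rate differs from the population rate (for example, after downsampling non-clicks). The function names and the example rates are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(z: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def prior_correct_intercept(b0: float, tau: float, ybar: float) -> float:
    """Shift a fitted intercept to match the population event rate.

    b0   : intercept estimated on the (possibly rebalanced) sample
    tau  : assumed true fraction of events in the population
    ybar : fraction of events in the sample used for fitting

    Returns b0 - ln[((1 - tau) / tau) * (ybar / (1 - ybar))].
    Slope coefficients are left unchanged by this correction.
    """
    if not (0.0 < tau < 1.0 and 0.0 < ybar < 1.0):
        raise ValueError("tau and ybar must lie strictly between 0 and 1")
    return b0 - math.log(((1.0 - tau) / tau) * (ybar / (1.0 - ybar)))


# Hypothetical example: model fitted on a balanced sample (ybar = 0.5),
# while the assumed population event rate is 1% (tau = 0.01).
b0_sample = 0.0  # intercept-only model on a balanced sample predicts 0.5
b0_population = prior_correct_intercept(b0_sample, tau=0.01, ybar=0.5)
p_population = sigmoid(b0_population)
```

With these illustrative numbers the corrected intercept-only prediction equals the assumed population rate, which is the property the correction is designed to restore. When the sample rate already equals the population rate, the shift is zero.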
