Conference: International Conference on Advanced Data Mining and Applications

Logistic Regression Bias Correction for Large Scale Data with Rare Events


Abstract

Logistic regression is a classical classification method that has been widely used in applications with a binary dependent variable. However, when a data set is imbalanced, traditional logistic regression underestimates the probability of the rare event. With the data explosion of recent years, some researchers have proposed large-scale logistic regression methods, but these still fail to account for rare events, so their models are biased when applied to large-scale data sets with rare events. To address this problem, this paper proposes the LRBC method, which corrects the bias of logistic regression for large-scale data sets with rare events. Empirical studies compare LRBC with several state-of-the-art algorithms on a real ad-click data set. The results show that LRBC achieves much better classification performance and that the distributed bias-correction process scales well.
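The abstract does not describe how LRBC computes its correction, so the following is not the paper's method. As a rough illustration of the underlying problem only, this sketch fits an ordinary maximum-likelihood logistic regression on a single machine and then subtracts the analytic finite-sample bias estimate described by King and Zeng (2001) for rare-events data. The function names, the synthetic data, and the choice of this particular correction are all assumptions made for the example.

```python
import numpy as np


def fit_logistic_mle(X, y, n_iter=50, tol=1e-10):
    """Plain maximum-likelihood logistic regression via Newton's method."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = p * (1.0 - p)                      # diagonal of the weight matrix W
        grad = X.T @ (y - p)                   # score vector
        hess = X.T @ (X * w[:, None])          # Fisher information X'WX
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def analytic_bias(X, beta):
    """Estimated finite-sample bias of the MLE (unweighted case).

    bias = (X'WX)^{-1} X'W xi, with xi_i = 0.5 * Q_ii * (2 * p_i - 1)
    and Q = X (X'WX)^{-1} X'. This is an illustrative stand-in,
    not the LRBC procedure from the paper.
    """
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    w = p * (1.0 - p)
    info_inv = np.linalg.inv(X.T @ (X * w[:, None]))
    q_diag = np.einsum("ij,jk,ik->i", X, info_inv, X)   # diagonal of Q
    xi = 0.5 * q_diag * (2.0 * p - 1.0)
    return info_inv @ (X.T @ (w * xi))


# Hypothetical synthetic data in which positive labels are rare.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])           # intercept + one feature
true_beta = np.array([-4.0, 1.0])              # assumed values, for illustration
prob = 1.0 / (1.0 + np.exp(-(X @ true_beta)))
y = (rng.random(n) < prob).astype(float)

beta_mle = fit_logistic_mle(X, y)
beta_corrected = beta_mle - analytic_bias(X, beta_mle)
print("event rate:     ", y.mean())
print("MLE coefficients:", beta_mle)
print("bias-corrected:  ", beta_corrected)
```

For an intercept-only model the estimated bias reduces to (ȳ − 0.5) / (n ȳ (1 − ȳ)), which is negative whenever events are rarer than 50%. That is the sense in which the uncorrected intercept, and therefore the predicted event probability, tends to be too low.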
