Venue: Annual Conference on Neural Information Processing Systems

FilterBoost: Regression and Classification on Large Datasets

Abstract

We study boosting in the filtering setting, where the booster draws examples from an oracle instead of using a fixed training set and so may train efficiently on very large datasets. Our algorithm, which is based on a logistic regression technique proposed by Collins, Schapire, & Singer, requires fewer assumptions to achieve bounds equivalent to or better than previous work. Moreover, we give the first proof that the algorithm of Collins et al. is a strong PAC learner, albeit within the filtering setting. Our proofs demonstrate the algorithm's strong theoretical properties for both classification and conditional probability estimation, and we validate these results through extensive experiments. Empirically, our algorithm proves more robust to noise and overfitting than batch boosters in conditional probability estimation and proves competitive in classification.
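
The filtering idea described above can be illustrated with a rough sketch. This is not the authors' implementation: it assumes the filter is a rejection sampler that accepts an example (x, y) with logistic weight 1 / (1 + exp(y * H(x))), where H is the current ensemble score, and it omits sample-size choices, stopping rules, and the analysis, none of which are given in the abstract. The oracle, the decision-stump weak learner, and every function name below are hypothetical stand-ins chosen for illustration.

```python
import math
import random


def make_oracle(rng, noise=0.1):
    """Hypothetical oracle: x uniform in [-1, 1], label sign(x) flipped with prob. `noise`."""
    def oracle():
        x = rng.uniform(-1.0, 1.0)
        y = 1 if x >= 0 else -1
        if rng.random() < noise:
            y = -y
        return x, y
    return oracle


def score(ensemble, x):
    """Ensemble score H(x): weighted vote of the weak hypotheses."""
    return sum(alpha * h(x) for alpha, h in ensemble)


def filtered_sample(oracle, ensemble, rng, n):
    """Draw from the oracle, keeping (x, y) with probability 1 / (1 + exp(y * H(x)))."""
    kept = []
    while len(kept) < n:
        x, y = oracle()
        accept_prob = 1.0 / (1.0 + math.exp(y * score(ensemble, x)))
        if rng.random() < accept_prob:
            kept.append((x, y))
    return kept


def fit_stump(sample):
    """Assumed weak learner: the threshold/sign stump with the most correct labels."""
    best = None
    for thr, _ in sample:
        for sign in (1, -1):
            correct = sum(1 for x, y in sample if (sign if x >= thr else -sign) == y)
            if best is None or correct > best[0]:
                best = (correct, thr, sign)
    _, thr, sign = best
    return lambda x, thr=thr, sign=sign: sign if x >= thr else -sign


def boost(oracle, rng, rounds=5, n=200):
    """Each round trains on one filtered sample and estimates the edge on a fresh one."""
    ensemble = []
    for _ in range(rounds):
        h = fit_stump(filtered_sample(oracle, ensemble, rng, n))
        fresh = filtered_sample(oracle, ensemble, rng, n)
        accuracy = sum(1 for x, y in fresh if h(x) == y) / n
        # Clamp the estimated edge so the logarithm below stays finite.
        gamma = min(max(accuracy - 0.5, 1e-3), 0.49)
        alpha = 0.5 * math.log((0.5 + gamma) / (0.5 - gamma))
        ensemble.append((alpha, h))
    return ensemble


def probability(ensemble, x):
    """Estimate P(y = +1 | x) through a logistic link on the ensemble score."""
    return 1.0 / (1.0 + math.exp(-score(ensemble, x)))
```

No fixed training set is ever stored: every round pulls fresh examples from the oracle, which is what allows the booster to train on very large datasets. With a seeded `random.Random`, calling `boost(make_oracle(rng), rng)` and then `probability(ensemble, 0.8)` gives a conditional probability estimate for the positive class at x = 0.8. Without label noise the acceptance probability shrinks as the ensemble improves, so a practical version would need a rule for stopping when too many examples are rejected.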
