Conference: European Conference on Principles of Data Mining and Knowledge Discovery
The Need for Low Bias Algorithms in Classification Learning from Large Data Sets

Abstract

This paper reviews the appropriateness for application to large data sets of standard machine learning algorithms, which were mainly developed in the context of small data sets. Sampling and parallelisation have proved useful means for reducing computation time when learning from large data sets. However, such methods assume that algorithms that were designed for use with what are now considered small data sets are also fundamentally suitable for large data sets. It is plausible that optimal learning from large data sets requires a different type of algorithm to optimal learning from small data sets. This paper investigates one respect in which data set size may affect the requirements of a learning algorithm - the bias plus variance decomposition of classification error. Experiments show that learning from large data sets may be more effective when using an algorithm that places greater emphasis on bias management, rather than variance management.
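
The bias plus variance decomposition named in the abstract can be illustrated with a small experiment. What follows is a minimal sketch, not the paper's experimental setup. It assumes a main-prediction definition of bias and variance for 0-1 loss (the paper may use a different definition), a hypothetical noise-free one-dimensional target `true_label`, and two stand-in learners, `fit_stump` (a single threshold, expected to be bias-dominated) and `fit_1nn` (nearest neighbour, expected to be variance-dominated). All names, sizes, and seeds are illustrative assumptions.

```python
import random
from collections import Counter

def true_label(x):
    # Hypothetical noise-free target: class 1 on two disjoint intervals.
    return 1 if (0.2 < x < 0.4) or (0.6 < x < 0.8) else 0

def sample(rng, n):
    # Draw n training points uniformly from [0, 1) and label them.
    return [(x, true_label(x)) for x in (rng.random() for _ in range(n))]

def fit_stump(train):
    # Stand-in restrictive learner: one threshold, chosen by training accuracy.
    best = None
    for t, _ in train:
        for left in (0, 1):
            correct = sum((left if x <= t else 1 - left) == y for x, y in train)
            if best is None or correct > best[0]:
                best = (correct, t, left)
    _, t, left = best
    return lambda x: left if x <= t else 1 - left

def fit_1nn(train):
    # Stand-in flexible learner: copy the label of the nearest training point.
    return lambda x: min(train, key=lambda p: abs(p[0] - x))[1]

def bias_variance(fit, n, trials=15, seed=0):
    # Train `trials` models on independent samples of size n, then measure,
    # per test point, whether the majority ("main") prediction is wrong (bias)
    # and how often individual models disagree with it (variance).
    rng = random.Random(seed)
    test = [i / 200 + 0.0025 for i in range(200)]
    models = [fit(sample(rng, n)) for _ in range(trials)]
    bias = variance = error = 0.0
    for x in test:
        preds = [m(x) for m in models]
        main = Counter(preds).most_common(1)[0][0]
        y = true_label(x)
        bias += main != y
        variance += sum(p != main for p in preds) / trials
        error += sum(p != y for p in preds) / trials
    k = len(test)
    return bias / k, variance / k, error / k

for n in (20, 200):
    for name, fit in (("stump", fit_stump), ("1-NN", fit_1nn)):
        b, v, e = bias_variance(fit, n)
        print(f"n={n:4d} {name:5s} bias={b:.3f} variance={v:.3f} error={e:.3f}")
```

Comparing the printed terms across the two training-set sizes gives a rough feel for how each term responds to more data. The numbers depend entirely on the synthetic target and seed and say nothing about the paper's own results.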
