Conference: 2012 IEEE Fifth International Conference on Advanced Computational Intelligence

Scalable learning and knowledge discovery via adaptive sampling


Abstract

Scalability is an important issue in data mining and knowledge discovery for real-world applications, where huge data sets often render ordinary learning algorithms infeasible. Random sampling, an important technique for parameter estimation and hypothesis testing that is widely used in statistical analysis, can be exploited to make learning and knowledge discovery scalable. Adaptive sampling is typically more efficient than traditional batch sampling because it determines the sample size from the samples seen so far, rather than fixing it in advance. Recently, a new adaptive sampling method for estimating the mean of a Bernoulli random variable was proposed in [2]. It was shown empirically to require a significantly smaller sample size (i.e., number of sampled instances) than existing approaches while maintaining competitive accuracy and confidence. This paper presents a theoretical analysis of the properties of that sampling method, together with a brief outline of how to use it to develop an efficient ensemble learning method with Boosting.
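The abstract does not spell out the stopping rule of the method in [2], so the following is only a minimal sketch of the general idea of adaptive sampling for a Bernoulli mean, not the method from [2]. It assumes a textbook sequential rule: a Hoeffding confidence radius with the failure budget spread over all steps, and a relative-error target. The function name `adaptive_bernoulli_mean` and the parameters `draw`, `eps`, `delta`, and `max_samples` are hypothetical names chosen for illustration.

```python
import math
import random
from typing import Callable, Tuple


def adaptive_bernoulli_mean(
    draw: Callable[[], int],
    eps: float,
    delta: float,
    max_samples: int = 1_000_000,
) -> Tuple[float, int]:
    """Estimate a Bernoulli mean p, deciding the sample size on the fly.

    Illustrative rule (an assumption, not the method of [2]): stop at the
    first step t where the Hoeffding radius r_t satisfies
    r_t <= eps * (p_hat_t - r_t). If the true mean lies within r_t of the
    running estimate at every step (probability >= 1 - delta by a union
    bound), the returned estimate has relative error at most eps.
    Returns (estimate, number of samples drawn).
    """
    successes = 0
    for t in range(1, max_samples + 1):
        successes += draw()  # draw() must return 0 or 1
        p_hat = successes / t
        # Per-step failure budget delta / (t * (t + 1)) sums to delta over
        # all t, so the interval holds at every step simultaneously.
        radius = math.sqrt(math.log(2 * t * (t + 1) / delta) / (2 * t))
        # Stop once the radius is small relative to the lower bound on p.
        if radius <= eps * (p_hat - radius):
            return p_hat, t
    # Sampling budget exhausted: return the estimate without the guarantee.
    return successes / max_samples, max_samples


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    estimate, n = adaptive_bernoulli_mean(
        lambda: int(rng.random() < 0.5), eps=0.2, delta=0.05
    )
    print(f"estimate={estimate:.3f} after {n} samples")
```

The point of the sketch is that the number of samples is an output of the procedure rather than an input: a source with a larger mean triggers the stopping condition earlier than one with a smaller mean. The union bound used here is conservative, so this rule should not be read as an indication of the sample sizes reported for the method in [2].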
