IEEE International Conference on Advanced Computational Intelligence

Scalable learning and knowledge discovery via adaptive sampling

Abstract

Scalability is an important issue in data mining and knowledge discovery: in real-world applications, huge data sets often render ordinary learning algorithms infeasible. Random sampling, a technique widely used in statistical analysis for parameter estimation and hypothesis testing, can be exploited to make learning and knowledge discovery scalable. Adaptive sampling is typically more efficient than traditional batch sampling because it determines the sample size from the samples seen so far rather than fixing it in advance. Recently, a new adaptive sampling method for estimating the mean of a Bernoulli random variable was proposed in [2]; it was shown empirically to require a significantly smaller sample size (i.e., number of sampled instances) than existing approaches while maintaining competitive accuracy and confidence. This paper presents a theoretical analysis of the properties of that sampling method, together with a brief outline of how to use it to develop an efficient ensemble learning method with Boosting.
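
To make the idea of data-dependent sample sizes concrete, the following is a minimal sketch of adaptive sampling for a Bernoulli mean. It is not the method of [2], whose details are not given in this abstract. Instead it assumes a classic relative-error stopping rule in the style of Dagum, Karp, Luby, and Ross: keep sampling until the number of successes reaches a threshold that depends only on the target relative error eps and failure probability delta, then report threshold / n. The names stopping_threshold, adaptive_bernoulli_mean, make_bernoulli, and max_samples are hypothetical and introduced only for illustration.

```python
import math
import random


def stopping_threshold(eps, delta):
    """Success count at which sampling stops (assumed stopping-rule constant)."""
    if not (0.0 < eps < 1.0 and 0.0 < delta < 1.0):
        raise ValueError("eps and delta must lie strictly between 0 and 1")
    upsilon = 4.0 * (math.e - 2.0) * math.log(2.0 / delta) / (eps ** 2)
    return 1.0 + (1.0 + eps) * upsilon


def adaptive_bernoulli_mean(draw, eps, delta, max_samples=10_000_000):
    """Estimate a Bernoulli mean with a data-dependent sample size.

    draw: zero-argument callable returning 0 or 1.
    Returns (estimate, number_of_samples_used).
    max_samples is a safety cap added for this sketch, not part of the rule.
    """
    threshold = stopping_threshold(eps, delta)
    successes = 0
    n = 0
    while successes < threshold:
        if n >= max_samples:
            raise RuntimeError("sample cap reached before the stopping condition")
        successes += draw()
        n += 1
    # The estimate uses the threshold, not the raw success count.
    return threshold / n, n


def make_bernoulli(p, seed):
    """Seeded Bernoulli(p) source, used here only to simulate data."""
    rng = random.Random(seed)
    return lambda: 1 if rng.random() < p else 0


# Usage: a rarer event needs more samples before the rule stops.
for p in (0.6, 0.3):
    estimate, n_used = adaptive_bernoulli_mean(make_bernoulli(p, seed=0), eps=0.1, delta=0.05)
    print(f"p={p}: estimate={estimate:.4f} after {n_used} samples")
```

Under this rule the expected number of samples scales roughly as threshold / p, so the sample size adapts to the unknown mean. A batch scheme, by contrast, must fix the sample size up front for the worst case.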
