IEEE Annual Symposium on Foundations of Computer Science

Learning Sums of Independent Random Variables with Sparse Collective Support


Abstract

We study the learnability of sums of independent integer random variables given a bound on the size of the union of their supports. For a subset A ⊂ Z+ of non-negative integers, a sum of independent random variables with collective support A (called an "A-sum" in this paper) is a distribution S = X_1 + ... + X_N where the X_i's are mutually independent (but not necessarily identically distributed) integer random variables, all of whose supports are contained in A. We give two main algorithmic results for learning such distributions: 1) For the case |A| = 3, we give an algorithm for learning A-sums to accuracy ε that uses poly(1/ε) samples and runs in time poly(1/ε), independent of N and of the elements of A. 2) For an arbitrary constant k ≥ 4, if A = {a_1, ..., a_k} with 0 ≤ a_1 < ... < a_k, we give an algorithm that uses poly(1/ε) · log log a_k samples (independent of N) and runs in time poly(1/ε, log a_k). We prove an essentially matching lower bound: if |A| = 4, then any algorithm must use Ω(log log a_4) samples even for learning to constant accuracy. We also give similar-in-spirit (but quantitatively very different) algorithmic results, and essentially matching lower bounds, for the case in which A is not known to the learner. Our learning algorithms employ new limit theorems which may be of independent interest. Our algorithms and lower bounds together settle the question of how the sample complexity of learning sums of independent integer random variables scales with the elements in the union of their supports, both in the known-support and unknown-support settings. Finally, all our algorithms easily extend to the "semi-agnostic" learning model, in which training data is generated from a distribution that is only c·ε-close to some A-sum for a constant c > 0.
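To make the definition concrete, the following minimal Python sketch generates draws from an A-sum. It illustrates only the sampling model described above, not the paper's learning algorithm; the function name sample_a_sum and the particular weights are invented for this example.

    import random
    from collections import Counter

    def sample_a_sum(support, weight_rows, n_samples):
        """Draw n_samples from S = X_1 + ... + X_N, where X_i takes values in
        `support` with probabilities weight_rows[i]; the X_i's are mutually
        independent but not necessarily identically distributed."""
        return [sum(random.choices(support, weights=row, k=1)[0]
                    for row in weight_rows)
                for _ in range(n_samples)]

    # An A-sum with collective support A = {0, 1, 10} (so |A| = 3) and N = 50
    # non-identically-distributed summands.
    A = [0, 1, 10]
    N = 50
    weight_rows = [[0.5, 0.3, 0.2] if i % 2 == 0 else [0.1, 0.8, 0.1]
                   for i in range(N)]

    samples = sample_a_sum(A, weight_rows, n_samples=10000)
    print(Counter(samples).most_common(5))  # the learner's only view of S

The learner observes only draws of S, never the individual X_i's. Even though S here ranges over values as large as 500, result 1) above says that for |A| = 3 the number of samples needed is poly(1/ε), independent of N.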