IEEE Annual Symposium on Foundations of Computer Science

Learning Sums of Independent Integer Random Variables

Abstract

Let S = X_1 + ⋯ + X_n be a sum of n independent integer random variables X_i, where each X_i is supported on {0, 1, ..., k-1} but otherwise may have an arbitrary distribution (in particular the X_i's need not be identically distributed). How many samples are required to learn the distribution of S to high accuracy? In this paper we show that the answer is completely independent of n, and moreover we give a computationally efficient algorithm which achieves this low sample complexity. More precisely, our algorithm learns any such S to ε-accuracy (with respect to the total variation distance between distributions) using poly(k, 1/ε) samples, independent of n. Its running time is poly(k, 1/ε) in the standard word RAM model. Thus we give a broad generalization of the main result of [DDS12stoc], which gave a similar learning result for the special case k = 2 (when the distribution S is a Poisson Binomial Distribution). Prior to this work, no nontrivial results were known for learning these distributions even in the case k = 3. A key difficulty is that, in contrast to the case of k = 2, sums of independent {0, 1, 2}-valued random variables may behave very differently from (discretized) normal distributions, and in fact may be rather complicated: they are not log-concave, they can be Θ(n)-modal, there is no relationship between Kolmogorov distance and total variation distance for the class, etc. Nevertheless, the heart of our learning result is a new limit theorem which characterizes what the sum of an arbitrary number of arbitrary independent {0, 1, ..., k-1}-valued random variables may look like. Previous limit theorems in this setting made strong assumptions on the "shift invariance" of the random variables X_i in order to force a discretized normal limit. We believe that our new limit theorem, as the first result for truly arbitrary sums of independent {0, 1, ..., k-1}-valued random variables, is of independent interest.
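To make the setup concrete, the sketch below (Python, with hypothetical helper names) builds n arbitrary, non-identical distributions on {0, 1, ..., k-1}, computes the exact distribution of S by convolution, draws i.i.d. samples of S, and measures the total variation distance between the naive empirical histogram and the truth. It only illustrates what "learning S to ε-accuracy in d_TV" means; it is not the paper's algorithm, whose sample complexity is poly(k, 1/ε) independent of n.

```python
import numpy as np

def true_pmf_of_sum(pmfs):
    """Exact pmf of S = X_1 + ... + X_n by iterated convolution.
    pmfs: list of length-k arrays, pmfs[i][j] = Pr[X_i = j]."""
    dist = np.array([1.0])
    for p in pmfs:
        dist = np.convolve(dist, p)
    return dist

def sample_sum(pmfs, m, rng):
    """Draw m i.i.d. samples of S by sampling each X_i independently."""
    k = len(pmfs[0])
    total = np.zeros(m, dtype=int)
    for p in pmfs:
        total += rng.choice(k, size=m, p=p)
    return total

def empirical_pmf(samples, support_size):
    """Histogram (empirical) estimate of the distribution of S."""
    counts = np.bincount(samples, minlength=support_size)
    return counts / counts.sum()

def total_variation(p, q):
    """d_TV(p, q) = (1/2) * sum_j |p_j - q_j|."""
    return 0.5 * np.abs(p - q).sum()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, k = 200, 3
    # n arbitrary (non-identical) distributions on {0, ..., k-1}
    pmfs = [rng.dirichlet(np.ones(k)) for _ in range(n)]

    truth = true_pmf_of_sum(pmfs)   # supported on {0, ..., n*(k-1)}
    m = 10_000                      # sample budget for this toy illustration
    samples = sample_sum(pmfs, m, rng)
    est = empirical_pmf(samples, len(truth))

    print("d_TV(empirical, truth) =", total_variation(est, truth))
```

Note that the naive histogram estimator used here generally needs a number of samples growing with the support of S; the point of the paper is that a more structured estimator avoids any dependence on n.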
