JMLR: Workshop and Conference Proceedings

Bessel Smoothing and Multi-Distribution Property Estimation


Abstract

We consider a basic problem in statistical learning: estimating properties of multiple discrete distributions. Denoting by $\Delta_k$ the standard simplex over $[k] := \{0,1,\ldots,k\}$, a property of $d$ distributions is a mapping from $\Delta_k^d$ to $\mathbb{R}$. These properties include well-known distribution characteristics such as Shannon entropy and support size ($d=1$), and many important divergence measures between distributions ($d=2$). The primary problem being considered is to learn the property value of an unknown $d$-tuple of distributions from its sample. The study of such problems dates back to the works of Efron and Thisted (1976b); Thisted and Efron (1987); Good (1953b); Carlton (1969), and has been pushed forward steadily during the past decades. Surprisingly, before our work, the general landscape of this fundamental learning problem was insufficiently understood, and nearly all the existing results are for the special case $d \le 2$. Our first main result provides a near-linear-time computable algorithm that, given independent samples from any collection of distributions and for a broad class of multi-distribution properties, learns the property as well as the empirical plug-in estimator that uses samples with logarithmic-factor larger sizes. As a corollary of this, for any $\varepsilon > 0$ and fixed $d \in \mathbb{Z}^+$, a $d$-distribution property over $[k]$ that is Lipschitz and additively separable can be learned to an accuracy of $\varepsilon$ using a sample of size $\mathcal{O}(k/(\varepsilon^3\sqrt{\log k}))$, with high probability. Our second result addresses a closely related problem, tolerant independence testing: one receives samples from the unknown joint and marginal distributions, and attempts to infer the $\ell_1$ distance between the joint distribution and the product distribution of the marginals. We show that this testing problem also admits a sample complexity sub-linear in the alphabet sizes, demonstrating the broad applicability of our approach.
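
For orientation, the empirical plug-in estimator that the abstract uses as its baseline can be sketched in a few lines of Python. This is only the baseline, not the paper's Bessel-smoothing algorithm, which the abstract does not specify. The function names `plugin_entropy` and `plugin_l1_to_independence` are hypothetical, and the second function assumes, for simplicity, that the marginals are estimated from the same paired sample as the joint distribution.

```python
import math
from collections import Counter


def plugin_entropy(sample):
    """Plug-in estimate of Shannon entropy (in nats).

    Replaces the unknown distribution by the empirical frequencies
    of the sample and evaluates the entropy formula on them.
    """
    n = len(sample)
    counts = Counter(sample)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def plugin_l1_to_independence(pairs):
    """Plug-in estimate of the l1 distance between the joint
    distribution and the product of its marginals.

    Assumption (not from the abstract): marginals are computed
    from the same sample of (x, y) pairs as the joint.
    """
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    total = 0.0
    # Sum over every (x, y) cell seen in either marginal; a Counter
    # returns 0 for cells that never appear in the joint sample.
    for x in px:
        for y in py:
            total += abs(joint[(x, y)] / n - (px[x] / n) * (py[y] / n))
    return total
```

The first main result in the abstract can be read against this sketch: the proposed estimator is claimed to match the accuracy this plug-in approach would reach with a sample larger by a logarithmic factor.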
