Conference: Annual ACM-SIAM Symposium on Discrete Algorithms

Which Distribution Distances are Sublinearly Testable?

Abstract

Given samples from an unknown distribution p and a description of a distribution q, are p and q close or far? This question of "identity testing" has received significant attention in the case of testing whether p and q are equal or far in total variation distance. However, in recent work [VV11a, ADK15, DP17], the following questions have been critical to solving problems at the frontiers of distribution testing: (1) Alternative distances: can we test whether p and q are far in other distances, say Hellinger? (2) Tolerance: can we test when p and q are close, rather than equal, and if so, close in which distances? Motivated by these questions, we characterize the complexity of distribution testing under a variety of distances, including total variation, $\ell_2$, Hellinger, Kullback-Leibler, and $\chi^2$. For each pair of distances $d_1$ and $d_2$, we study the complexity of testing whether p and q are close in $d_1$ versus far in $d_2$, with a focus on identifying which problems allow strongly sublinear testers (i.e., those with complexity $O(n^{1-\gamma})$ for some $\gamma > 0$, where n is the size of the support of the distributions p and q). We provide matching upper and lower bounds for each case. We also study these questions in the case where we only have samples from q (equivalence testing), showing qualitative differences from identity testing in terms of when tolerance can be achieved. Our algorithms fall into the classical paradigm of $\chi^2$-statistics, but require crucial changes to handle the challenges introduced by each distance we consider. Finally, we survey other recent results in an attempt to serve as a reference for the complexity of various distribution testing problems.
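As a rough illustration of the quantities named in the abstract, the sketch below computes the five distances for two explicitly given discrete distributions, using their standard textbook definitions (with the $1/\sqrt{2}$ normalization for Hellinger, which is an assumption about the convention). It also includes one $\chi^2$-style statistic of the kind commonly used in identity testing; this form is an illustrative assumption, not necessarily the exact statistic used in the paper. All function names are hypothetical.

```python
import math


def total_variation(p, q):
    # d_TV(p, q) = (1/2) * sum_i |p_i - q_i|
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


def l2_distance(p, q):
    # ||p - q||_2 = sqrt(sum_i (p_i - q_i)^2)
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


def hellinger(p, q):
    # d_H(p, q) = (1/sqrt(2)) * sqrt(sum_i (sqrt(p_i) - sqrt(q_i))^2)
    s = sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))
    return math.sqrt(s / 2.0)


def kl_divergence(p, q):
    # d_KL(p || q) = sum_i p_i * log(p_i / q_i), with 0 * log(0 / q_i) = 0
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue
        if qi == 0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total


def chi_squared(p, q):
    # chi^2(p, q) = sum_i (p_i - q_i)^2 / q_i
    total = 0.0
    for pi, qi in zip(p, q):
        if qi == 0:
            if pi > 0:
                return math.inf
            continue
        total += (pi - qi) ** 2 / qi
    return total


def chi2_style_statistic(counts, q, m):
    # Illustrative statistic (an assumption, not taken from the abstract):
    #   Z = sum_i ((N_i - m*q_i)^2 - N_i) / (m*q_i)
    # where N_i is the number of samples landing on element i.
    # If N_i ~ Poisson(m * p_i) independently, then E[Z] = m * chi^2(p, q).
    return sum(
        ((n_i - m * q_i) ** 2 - n_i) / (m * q_i)
        for n_i, q_i in zip(counts, q)
        if q_i > 0
    )


if __name__ == "__main__":
    p = [0.5, 0.5]
    q = [0.25, 0.75]
    print("TV       :", total_variation(p, q))
    print("l2       :", l2_distance(p, q))
    print("Hellinger:", hellinger(p, q))
    print("KL       :", kl_divergence(p, q))
    print("chi^2    :", chi_squared(p, q))
```

On any pair of distributions these functions should satisfy the standard relations $d_H^2 \le d_{TV} \le \sqrt{2}\, d_H$ and $d_{KL} \le \chi^2$, which is one reason the choice of "close" and "far" distances changes the difficulty of the testing problem.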