JMLR: Workshop and Conference Proceedings

Sample Complexity of Sinkhorn Divergences


Abstract

Optimal transport (OT) and maximum mean discrepancies (MMD) are now routinely used in machine learning to compare probability measures. We focus in this paper on Sinkhorn divergences (SDs), a regularized variant of OT distances which can interpolate, depending on the regularization strength $\varepsilon$, between OT ($\varepsilon=0$) and MMD ($\varepsilon=\infty$). Although the tradeoff induced by that regularization is now well understood computationally (OT, SDs and MMD require respectively $O(n^3\log n)$, $O(n^2)$ and $n^2$ operations given a sample size $n$), much less is known in terms of their sample complexity, namely the gap between these quantities, when evaluated using finite samples vs. their respective densities. Indeed, while the sample complexities of OT and MMD stand at two extremes, $1/n^{1/d}$ for OT in dimension $d$ and $1/\sqrt{n}$ for MMD, that of SDs has only been studied empirically. In this paper, we (i) derive a bound on the approximation error made with SDs when approximating OT as a function of the regularizer $\varepsilon$, (ii) prove that the optimizers of regularized OT are bounded in a Sobolev (RKHS) ball independent of the two measures and (iii) provide the first sample complexity bound for SDs, obtained by reformulating SDs as a maximization problem in a RKHS. We thus obtain a scaling in $1/\sqrt{n}$ (as in MMD), with a constant that however depends on $\varepsilon$, making the bridge between OT and MMD complete.
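The quantity studied in the abstract can be illustrated concretely. The sketch below is a minimal NumPy illustration (not code from the paper): it computes the entropy-regularized OT cost between two uniform empirical measures via standard Sinkhorn iterations, then forms the debiased Sinkhorn divergence $\mathrm{SD}_\varepsilon(\alpha,\beta) = \mathrm{OT}_\varepsilon(\alpha,\beta) - \tfrac12\mathrm{OT}_\varepsilon(\alpha,\alpha) - \tfrac12\mathrm{OT}_\varepsilon(\beta,\beta)$. The squared-Euclidean cost, function names, and iteration count are our own choices for the sketch.

```python
import numpy as np

def sinkhorn_cost(x, y, eps, n_iter=200):
    """Entropy-regularized OT cost OT_eps between the uniform empirical
    measures supported on samples x (n, d) and y (m, d)."""
    n, m = len(x), len(y)
    # Squared-Euclidean ground cost matrix C[i, j] = ||x_i - y_j||^2.
    C = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    K = np.exp(-C / eps)                        # Gibbs kernel
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
    u, v = np.ones(n), np.ones(m)
    for _ in range(n_iter):                     # Sinkhorn fixed-point updates
        u = a / (K @ v)
        v = b / (K.T @ u)
    P = u[:, None] * K * v[None, :]             # primal transport plan
    return float(np.sum(P * C))

def sinkhorn_divergence(x, y, eps, n_iter=200):
    """Debiased Sinkhorn divergence SD_eps(x, y); zero when x == y."""
    return (sinkhorn_cost(x, y, eps, n_iter)
            - 0.5 * sinkhorn_cost(x, x, eps, n_iter)
            - 0.5 * sinkhorn_cost(y, y, eps, n_iter))
```

Because of the self-term debiasing, `sinkhorn_divergence(x, x, eps)` vanishes, while for well-separated samples it behaves like a transport cost; the sample-complexity question in the paper concerns how fast this finite-sample value approaches its population counterpart as $n$ grows, and how the answer depends on $\varepsilon$.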
