JMLR: Workshop and Conference Proceedings

Learning Generative Models with Sinkhorn Divergences


Abstract

The ability to compare two degenerate probability distributions, that is, two distributions supported on low-dimensional manifolds in much higher-dimensional spaces, is a crucial factor in the estimation of generative models. It is therefore no surprise that optimal transport (OT) metrics, with their ability to handle measures with non-overlapping supports, have emerged as a promising tool. Yet training generative machines using OT raises formidable computational and statistical challenges, because of (i) the computational burden of evaluating OT losses, (ii) their instability and lack of smoothness, and (iii) the difficulty of estimating them, as well as their gradients, in high dimension. This paper presents the first tractable method to train large-scale generative models using an OT-based loss, called the Sinkhorn loss, which tackles these three issues by relying on two key ideas: (a) entropic smoothing, which turns the original OT loss into a differentiable and more robust quantity that can be computed using Sinkhorn fixed-point iterations; (b) algorithmic (automatic) differentiation of these iterations with seamless GPU execution. Additionally, entropic smoothing generates a family of losses interpolating between Wasserstein (OT) and energy distance / Maximum Mean Discrepancy (MMD) losses, allowing one to find a sweet spot that leverages the geometry of OT on the one hand, and the favorable high-dimensional sample complexity of MMD, which comes with unbiased gradient estimates, on the other. The resulting computational architecture complements standard deep network generative models nicely with a stack of extra layers implementing the loss function.
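The Sinkhorn fixed-point iterations mentioned in idea (a) can be sketched as follows. This is a minimal illustrative NumPy implementation, not the authors' code: the function name, the squared-Euclidean ground cost, and the parameter values are our assumptions. In the paper's setting these updates would run on GPU inside an automatic-differentiation framework so that gradients flow through the iterations into the generative model.

```python
import numpy as np

def sinkhorn_loss(x, y, eps=0.1, n_iters=100):
    """Entropy-regularized OT cost between the empirical measures on x and y.

    An illustrative sketch of Sinkhorn fixed-point iterations; `eps` is the
    entropic regularization strength and `n_iters` the number of updates,
    both chosen here for demonstration only.
    """
    n, m = len(x), len(y)
    # Squared-Euclidean ground cost between all pairs of samples.
    C = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    K = np.exp(-C / eps)                               # Gibbs kernel
    a = np.full(n, 1.0 / n)                            # uniform source weights
    b = np.full(m, 1.0 / m)                            # uniform target weights
    u, v = np.ones(n), np.ones(m)
    for _ in range(n_iters):                           # Sinkhorn fixed point
        u = a / (K @ v)
        v = b / (K.T @ u)
    P = u[:, None] * K * v[None, :]                    # transport plan
    return float(np.sum(P * C))

rng = np.random.default_rng(0)
x = rng.normal(size=(50, 2))           # samples from one measure
y = rng.normal(loc=3.0, size=(50, 2))  # samples from a shifted measure
near, far = sinkhorn_loss(x, x), sinkhorn_loss(x, y)
```

As eps → 0 this regularized cost approaches the unregularized OT cost, while increasing eps pushes it toward an MMD-like quantity, which is the interpolation the abstract describes; the paper's Sinkhorn divergence further debiases the cost using the symmetric self-comparison terms.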
