Journal: Statistics and Computing

Approximation and sampling of multivariate probability distributions in the tensor train decomposition


Abstract

General multivariate distributions are notoriously expensive to sample from, particularly the high-dimensional posterior distributions in PDE-constrained inverse problems. This paper develops a sampler for arbitrary continuous multivariate distributions that is based on low-rank surrogates in the tensor train format, a methodology that has been exploited for many years for scalable, high-dimensional density function approximation in quantum physics and chemistry. We build upon recent developments of the cross approximation algorithms in linear algebra to construct a tensor train approximation to the target probability density function using a small number of function evaluations. For sufficiently smooth distributions, the storage required for accurate tensor train approximations is moderate, scaling linearly with dimension. In turn, the structure of the tensor train surrogate allows sampling by an efficient conditional distribution method since marginal distributions are computable with linear complexity in dimension. Expected values of non-smooth quantities of interest, with respect to the surrogate distribution, can be estimated using transformed independent uniformly-random seeds that provide Monte Carlo quadrature or transformed points from a quasi-Monte Carlo lattice to give more efficient quasi-Monte Carlo quadrature. Unbiased estimates may be calculated by correcting the transformed random seeds using a Metropolis-Hastings accept/reject step, while the quasi-Monte Carlo quadrature may be corrected either by a control-variate strategy or by importance weighting. 
We show that the error in the tensor train approximation propagates linearly into the Metropolis-Hastings rejection rate and the integrated autocorrelation time of the resulting Markov chain; thus, the integrated autocorrelation time may be made arbitrarily close to 1, implying that, asymptotic in sample size, the cost per effectively independent sample is one target density evaluation plus the cheap tensor train surrogate proposal that has linear cost with dimension. These methods are demonstrated in three computed examples: fitting failure time of shock absorbers; a PDE-constrained inverse diffusion problem; and sampling from the Rosenbrock distribution. The delayed rejection adaptive Metropolis (DRAM) algorithm is used as a benchmark. In all computed examples, the importance weight-corrected quasi-Monte Carlo quadrature performs best and is more efficient than DRAM by orders of magnitude across a wide range of approximation accuracies and sample sizes. Indeed, all the methods developed here significantly outperform DRAM in all computed examples.
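The sampling pipeline described in the abstract can be illustrated in two dimensions with a minimal sketch. It is not the authors' implementation: a truncated SVD of a gridded density stands in for the tensor train cross approximation (in 2-D the two coincide in format, but the SVD evaluates the whole grid), and the banana-shaped `target`, the grid bounds, the clipping floor, and all function names are assumptions made for illustration. Uniform seeds are mapped through the marginal of x and then the conditional of y given x, and the resulting proposals are corrected either by an independence Metropolis-Hastings step or by self-normalised importance weights.

```python
import numpy as np

def target(x, y):
    # Unnormalised banana-shaped density; a hypothetical stand-in for a posterior.
    return np.exp(-0.5 * x**2 - 2.0 * (y - 0.5 * x**2) ** 2)

def build_surrogate(n=128, tol=1e-6):
    # Tabulate the target at cell midpoints and truncate its SVD. In 2-D this
    # is a two-core tensor train (assumption: it replaces cross approximation,
    # which would avoid evaluating the full grid).
    ex = np.linspace(-5.0, 5.0, n + 1)
    ey = np.linspace(-3.0, 9.0, n + 1)
    mx, my = 0.5 * (ex[:-1] + ex[1:]), 0.5 * (ey[:-1] + ey[1:])
    U, s, Vt = np.linalg.svd(target(mx[:, None], my[None, :]))
    rank = int(np.sum(s > tol * s[0]))
    P = (U[:, :rank] * s[:rank]) @ Vt[:rank]
    # Truncation can leave small negative entries; clip them (assumption).
    # The dense, clipped grid is kept for brevity rather than the factors.
    P = np.maximum(P, 1e-10 * P.max())
    return ex, ey, P, rank

def inverse_cdf(u, masses, edges):
    # Inverse CDF of a piecewise-constant density with the given cell masses.
    cdf = np.concatenate(([0.0], np.cumsum(masses)))
    return np.interp(u, cdf / cdf[-1], edges)

def sample_surrogate(seeds, ex, ey, P):
    # Conditional distribution method: seeds in [0,1]^2 -> (x, y) samples.
    x = inverse_cdf(seeds[:, 0], P.sum(axis=1), ex)  # marginal in x
    i = np.clip(np.searchsorted(ex, x, side="right") - 1, 0, P.shape[0] - 1)
    y = np.array([inverse_cdf(u, P[k], ey) for u, k in zip(seeds[:, 1], i)])
    j = np.clip(np.searchsorted(ey, y, side="right") - 1, 0, P.shape[1] - 1)
    # Normalised surrogate density evaluated at each sample.
    q = P[i, j] / (P.sum() * np.diff(ex)[i] * np.diff(ey)[j])
    return x, y, q

def mh_correct(w, rng):
    # Independence Metropolis-Hastings on the ratios w = target / surrogate.
    idx = np.zeros(len(w), dtype=int)
    cur, accepted = 0, 0
    for k in range(1, len(w)):
        if rng.random() * w[cur] < w[k]:  # accept with prob min(1, w[k]/w[cur])
            cur, accepted = k, accepted + 1
        idx[k] = cur
    return idx, accepted / (len(w) - 1)

rng = np.random.default_rng(0)
ex, ey, P, rank = build_surrogate()
x, y, q = sample_surrogate(rng.random((5000, 2)), ex, ey, P)
w = target(x, y) / q
idx, acc = mh_correct(w, rng)
xs, ys = x[idx], y[idx]                  # Metropolis-Hastings corrected chain
iw_mean_y = np.sum(w * y) / np.sum(w)    # importance-weighted estimate of E[y]
print(f"rank={rank}, acceptance={acc:.3f}")
print(f"MH means: x={xs.mean():.3f}, y={ys.mean():.3f}; IW mean y={iw_mean_y:.3f}")
```

For this particular stand-in density the marginal of x is standard normal and y given x is normal with mean 0.5x² and variance 1/4, so both estimators should return means near 0 for x and 0.5 for y. Replacing `rng.random((5000, 2))` with points from a quasi-Monte Carlo rule would correspond to the lattice-based variant that the abstract mentions, paired with the importance-weighted estimate rather than the Metropolis-Hastings chain.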
