Annual American Control Conference

Consistent Online Gaussian Process Regression Without the Sample Complexity Bottleneck


Abstract

Gaussian process regression provides a framework for nonlinear nonparametric Bayesian inference applicable across machine learning, robotics, chemical engineering, and other settings. Unfortunately, the computational burden of the posterior mean and covariance scales cubically with the training sample size. Even worse, in the online setting where samples perpetually arrive, this complexity approaches infinity. Thus, the popular perception is that Gaussian processes cannot be used with streaming data and that approximations are required. Motivated by this necessity, we develop the first compression sub-routine for online Gaussian processes that preserves their convergence to the population posterior, i.e., asymptotic posterior consistency, while ameliorating their intractable complexity growth with the sample size. We do so as follows: after each sequential Bayesian update, we fix an error neighborhood with respect to the Hellinger metric centered at the current empirical probability measure, and greedily toss out past kernel dictionary elements until we hit the boundary of this neighborhood. We call the resulting method Parsimonious Online Gaussian Processes (POG). When the error radius, or compression budget, decays to null with the sample size, exact asymptotic consistency is preserved (Theorem 1i) at the cost of unbounded memory in the limit. On the other hand, for a constant compression budget, POG converges to a neighborhood of the population posterior distribution (Theorem 1ii), but with finite memory that is at worst determined by the metric entropy of the feature space (Theorem 2). Experiments on benchmark data demonstrate that POG exhibits favorable performance in practice.
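To make the compression step concrete, here is a minimal, illustrative Python sketch of a POG-style update. It is not the authors' implementation: it assumes a one-dimensional RBF-kernel GP, and it approximates the Hellinger criterion by the closed-form Hellinger distance between the full and compressed posterior predictive distributions evaluated at the newest sample. The names `pog_step`, `posterior`, and `budget` are placeholders of our own.

```python
import numpy as np

def rbf(X, Z, lengthscale=1.0):
    """Squared-exponential kernel matrix between rows of X and Z."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

def posterior(x_star, D_x, D_y, noise=1e-2, lengthscale=1.0):
    """GP posterior predictive mean/variance at x_star given dictionary (D_x, D_y)."""
    K = rbf(D_x, D_x, lengthscale) + noise * np.eye(len(D_x))
    k = rbf(x_star[None, :], D_x, lengthscale)
    mu = (k @ np.linalg.solve(K, D_y)).item()
    var = (1.0 - k @ np.linalg.solve(K, k.T)).item() + noise
    return mu, max(var, 1e-12)

def hellinger(mu1, v1, mu2, v2):
    """Closed-form Hellinger distance between two univariate Gaussians."""
    h2 = 1.0 - np.sqrt(2.0 * np.sqrt(v1 * v2) / (v1 + v2)) \
             * np.exp(-0.25 * (mu1 - mu2) ** 2 / (v1 + v2))
    return np.sqrt(max(h2, 0.0))

def pog_step(D_x, D_y, x_new, y_new, budget, **gp_kw):
    """One POG-style update: append the new sample, then greedily drop
    dictionary elements while the compressed posterior predictive at x_new
    stays within the Hellinger budget of the uncompressed one."""
    D_x = np.vstack([D_x, x_new[None, :]])
    D_y = np.append(D_y, y_new)
    mu0, v0 = posterior(x_new, D_x, D_y, **gp_kw)   # uncompressed reference
    while len(D_x) > 1:
        # Find the element whose removal perturbs the posterior the least.
        dists = [
            hellinger(mu0, v0, *posterior(x_new, np.delete(D_x, i, 0),
                                          np.delete(D_y, i), **gp_kw))
            for i in range(len(D_x))
        ]
        i = int(np.argmin(dists))
        if dists[i] >= budget:   # removal would cross the error boundary
            break
        D_x, D_y = np.delete(D_x, i, 0), np.delete(D_y, i)
    return D_x, D_y

# Usage on a toy stream: the retained dictionary typically stays far
# smaller than the number of samples seen.
rng = np.random.default_rng(0)
D_x, D_y = np.empty((0, 1)), np.empty(0)
for _ in range(200):
    x = rng.uniform(-3.0, 3.0, size=1)
    y = np.sin(x[0]) + 0.1 * rng.standard_normal()
    D_x, D_y = pog_step(D_x, D_y, x, y, budget=0.05)
print(D_x.shape[0])
```

Note that this brute-force sketch re-solves the GP system for every candidate removal, which is cubic per solve; an efficient implementation would update the kernel matrix inverse incrementally. The sketch only illustrates the greedy prune-until-the-boundary logic described in the abstract.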
