Published in: IEEE Winter Applications and Computer Vision Workshops

A Log-likelihood Regularized KL Divergence for Video Prediction With a 3D Convolutional Variational Recurrent Network

Abstract

Latent variable models have been shown to be a powerful tool for modeling probability distributions over sequences. In this paper, we introduce a new variational model that extends the recurrent network in two ways for the task of video frame prediction. First, we introduce 3D convolutions inside all modules, including the recurrent model for future frame prediction, so that the model takes in and outputs a sequence of video frames at each timestep. This enables us to better exploit spatiotemporal information inside the variational recurrent model, allowing us to generate high-quality predictions. Second, we enhance the latent loss of the variational model by introducing a maximum likelihood estimate in addition to the KL divergence that is commonly used in variational models. This simple extension acts as a stronger regularizer in the variational autoencoder loss function and lets us obtain better results and generalizability. Experiments show that our model outperforms existing video prediction methods on several benchmarks while requiring fewer parameters.
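
The abstract does not give the exact form of the regularized latent loss, so the following is only a minimal sketch under stated assumptions: the posterior and prior are diagonal Gaussians, the added likelihood term is the log-likelihood of a posterior sample `z` under the prior, and the term is scaled by a hypothetical `weight`. The function names are illustrative and are not taken from the paper.

```python
import math

def gaussian_kl(mu_q, sigma_q, mu_p, sigma_p):
    """KL(q || p) between two diagonal Gaussians, summed over dimensions."""
    total = 0.0
    for mq, sq, mp, sp in zip(mu_q, sigma_q, mu_p, sigma_p):
        total += math.log(sp / sq) + (sq ** 2 + (mq - mp) ** 2) / (2 * sp ** 2) - 0.5
    return total

def gaussian_log_likelihood(z, mu, sigma):
    """log N(z; mu, diag(sigma^2)), summed over dimensions."""
    total = 0.0
    for zi, m, s in zip(z, mu, sigma):
        total += -0.5 * math.log(2 * math.pi) - math.log(s) - (zi - m) ** 2 / (2 * s ** 2)
    return total

def regularized_latent_loss(mu_q, sigma_q, mu_p, sigma_p, z, weight=1.0):
    """KL term plus a weighted negative log-likelihood of z under the prior.

    Assumption: this combination is one plausible reading of the abstract,
    not the paper's confirmed formulation. `weight` is a hypothetical knob.
    """
    kl = gaussian_kl(mu_q, sigma_q, mu_p, sigma_p)
    nll = -gaussian_log_likelihood(z, mu_p, sigma_p)
    return kl + weight * nll
```

In a training loop, `z` would typically be drawn from the posterior with the reparameterization trick (`z = mu_q + sigma_q * eps` with `eps` standard normal), and this latent loss would be added to the frame reconstruction loss at each timestep.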