International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences

FUTUREGAN: ANTICIPATING THE FUTURE FRAMES OF VIDEO SEQUENCES USING SPATIO-TEMPORAL 3D CONVOLUTIONS IN PROGRESSIVELY GROWING GANS

Abstract

We introduce a new encoder-decoder GAN model, FutureGAN, that predicts future frames of a video sequence conditioned on a sequence of past frames. During training, the networks solely receive the raw pixel values as input, without relying on additional constraints or dataset-specific conditions. To capture both the spatial and temporal components of a video sequence, spatio-temporal 3D convolutions are used in all encoder and decoder modules. Further, we utilize concepts of the existing progressively growing GAN (PGGAN), which achieves high-quality results on generating high-resolution single images. The FutureGAN model extends this concept to the complex task of video prediction. We conducted experiments on three different datasets: MovingMNIST, KTH Action, and Cityscapes. Our results show that the model learned representations to transform the information of an input sequence into a plausible future sequence effectively for all three datasets. The main advantage of the FutureGAN framework is that it is applicable to various different datasets without additional changes, whilst achieving stable results that are competitive with the state of the art in video prediction. The code to reproduce the results of this paper is publicly available at https://github.com/TUM-LMF/FutureGAN.
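To make the spatio-temporal 3D convolution mentioned in the abstract concrete, the following is a minimal, plain-Python sketch of a "valid" 3D convolution (really a cross-correlation, as in deep-learning frameworks) sliding a kernel over a (time, height, width) volume. This is an illustration only, not the authors' implementation; all names and shapes here are assumptions, and the real model applies optimized library ops over multi-channel feature maps.

```python
def conv3d_valid(volume, kernel):
    """Apply a (kt, kh, kw) kernel to a (T, H, W) volume with no padding.

    Each output value aggregates information across neighboring frames
    (the temporal axis) as well as neighboring pixels (the spatial axes),
    which is what lets 3D convolutions capture motion, not just appearance.
    """
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):          # slide over time
        plane = []
        for i in range(H - kh + 1):      # slide over height
            row = []
            for j in range(W - kw + 1):  # slide over width
                s = 0.0
                for dt in range(kt):
                    for di in range(kh):
                        for dj in range(kw):
                            s += volume[t + dt][i + di][j + dj] * kernel[dt][di][dj]
                row.append(s)
            plane.append(row)
        out.append(plane)
    return out
```

For example, a 2×2×2 kernel of ones applied to a 2×2×2 volume of ones sums all eight voxels into a single output value of 8.0, i.e. one response that mixes two frames' worth of spatial context.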
机译:我们介绍了一个新的编码器 - 解码器GaN模型,uuewargan,它预测了在过去帧序列上调节的视频序列的未来帧。在培训期间,网络仅接收原始像素值作为输入,而不依赖于附加约束或数据集特定条件。为了捕获视频序列的空间和时间分量,所有编码器和解码器模块都使用时空3D卷积。此外,我们利用现有的逐步增长的GaN(PGGAN)的概念(PGGAN)实现了高分辨率单图像的高质量结果。未来语言模型将此概念扩展到视频预测的复杂任务。我们在三个不同的数据集,搬家,犹太行动和城市景观中进行了实验。我们的结果表明,模型学习的表示,为所有三个数据集有效地将输入序列的信息转换为合理的未来序列。未来泛乐框架的主要优点是,它适用于各种不同的数据集,而无需额外变化,同时实现对视频预测中最先进的竞争力的稳定结果。重现本文结果的代码在https://github.com/tum-lmf/futuregan上公开可用。
