International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences

FUTUREGAN: ANTICIPATING THE FUTURE FRAMES OF VIDEO SEQUENCES USING SPATIO-TEMPORAL 3D CONVOLUTIONS IN PROGRESSIVELY GROWING GANS

Abstract

We introduce a new encoder-decoder GAN model, FutureGAN, that predicts future frames of a video sequence conditioned on a sequence of past frames. During training, the networks solely receive the raw pixel values as input, without relying on additional constraints or dataset-specific conditions. To capture both the spatial and temporal components of a video sequence, spatio-temporal 3D convolutions are used in all encoder and decoder modules. Further, we utilize concepts of the existing progressively growing GAN (PGGAN), which achieves high-quality results on generating high-resolution single images. The FutureGAN model extends this concept to the complex task of video prediction. We conducted experiments on three different datasets: MovingMNIST, KTH Action, and Cityscapes. Our results show that the model learned representations to transform the information of an input sequence into a plausible future sequence effectively for all three datasets. The main advantage of the FutureGAN framework is that it is applicable to various different datasets without additional changes, whilst achieving stable results that are competitive with the state of the art in video prediction. The code to reproduce the results of this paper is publicly available at https://github.com/TUM-LMF/FutureGAN.
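To make the spatio-temporal 3D convolution mentioned in the abstract concrete, the following is a minimal, plain-Python sketch of a "valid" 3D convolution (really a cross-correlation, as in deep-learning frameworks) sliding a kernel over a (time, height, width) volume. This is an illustration only, not the authors' implementation; all names and shapes here are assumptions, and the real model applies optimized library ops over multi-channel feature maps.

```python
def conv3d_valid(volume, kernel):
    """Apply a (kt, kh, kw) kernel to a (T, H, W) volume with no padding.

    Each output value aggregates information across neighboring frames
    (the temporal axis) as well as neighboring pixels (the spatial axes),
    which is what lets 3D convolutions capture motion, not just appearance.
    """
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):          # slide over time
        plane = []
        for i in range(H - kh + 1):      # slide over height
            row = []
            for j in range(W - kw + 1):  # slide over width
                s = 0.0
                for dt in range(kt):
                    for di in range(kh):
                        for dj in range(kw):
                            s += volume[t + dt][i + di][j + dj] * kernel[dt][di][dj]
                row.append(s)
            plane.append(row)
        out.append(plane)
    return out
```

For example, a 2×2×2 kernel of ones applied to a 2×2×2 volume of ones sums all eight voxels into a single output value of 8.0, i.e. one response that mixes two frames' worth of spatial context.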
机译:我们介绍了一个新的编码器 - 解码器GaN模型,uuewargan,它预测了在过去帧序列上调节的视频序列的未来帧。在培训期间,网络仅接收原始像素值作为输入,而不依赖于附加约束或数据集特定条件。为了捕获视频序列的空间和时间分量,所有编码器和解码器模块都使用时空3D卷积。此外,我们利用现有的逐步增长的GaN(PGGAN)的概念(PGGAN)实现了高分辨率单图像的高质量结果。未来语言模型将此概念扩展到视频预测的复杂任务。我们在三个不同的数据集,搬家,犹太行动和城市景观中进行了实验。我们的结果表明,模型学习的表示,为所有三个数据集有效地将输入序列的信息转换为合理的未来序列。未来泛乐框架的主要优点是,它适用于各种不同的数据集,而无需额外变化,同时实现对视频预测中最先进的竞争力的稳定结果。重现本文结果的代码在https://github.com/tum-lmf/futuregan上公开可用。
