Source: IEEE Transactions on Pattern Analysis and Machine Intelligence

Recurrent Temporal Aggregation Framework for Deep Video Inpainting



Abstract

Video inpainting aims to fill in spatio-temporal holes in videos with plausible content. Despite tremendous progress in deep learning-based inpainting of single images, extending these methods to the video domain remains challenging due to the additional time dimension. In this paper, we propose a recurrent temporal aggregation framework for fast deep video inpainting. In particular, we construct an encoder-decoder model, where the encoder takes multiple reference frames that can provide visible pixels revealed by the scene dynamics. These hints are aggregated and fed into the decoder. We apply recurrent feedback in an auto-regressive manner to enforce temporal consistency in the video results. We propose two architectural designs based on this framework. Our first model is a blind video decaptioning network (BVDNet) designed to automatically remove and inpaint text overlays in videos without any mask information. Our BVDNet won first place in the ECCV ChaLearn 2018 LAP Inpainting Competition Track 2: Video Decaptioning. Second, we propose a network for more general video inpainting (VINet) to deal with more arbitrary and larger holes. Video results demonstrate the advantage of our framework over state-of-the-art methods both qualitatively and quantitatively. The codes are available at https://github.com/mcahny/Deep-Video-Inpainting and https://github.com/shwoo93/video_decaptioning.
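The core idea in the abstract, aggregating visible pixels from multiple reference frames to fill holes, then applying auto-regressive feedback from the previous output for temporal consistency, can be illustrated with a minimal non-learned sketch. This is not the paper's network (BVDNet/VINet are deep encoder-decoder models); it is a hypothetical pixel-level analogue where all function names and parameters (`inpaint_frame`, `inpaint_video`, `window`, `blend`) are illustrative assumptions:

```python
import numpy as np

def inpaint_frame(target, mask, refs, ref_masks, prev_out=None, blend=0.5):
    """Fill hole pixels (mask == 1) in `target` by averaging pixels that are
    visible (ref mask == 0) in the reference frames, then blend the hole
    region with the previous output frame (recurrent, auto-regressive
    feedback) to encourage temporal consistency."""
    out = target.astype(np.float64).copy()
    vis = np.stack([1.0 - m for m in ref_masks])             # (R, H, W) visibility
    stack = np.stack([r.astype(np.float64) for r in refs])   # (R, H, W) pixels
    counts = vis.sum(axis=0)
    # Temporal aggregation: average over references where the pixel is visible.
    agg = np.where(counts > 0,
                   (stack * vis).sum(axis=0) / np.maximum(counts, 1.0),
                   0.0)
    hole = (mask == 1) & (counts > 0)
    out[hole] = agg[hole]
    if prev_out is not None:
        # Recurrent feedback: mix in the previously generated frame.
        out[mask == 1] = blend * out[mask == 1] + (1 - blend) * prev_out[mask == 1]
    return out

def inpaint_video(frames, masks, window=2):
    """Process frames in order, feeding each output back as the recurrent
    state for the next step; references come from a temporal window."""
    outputs, prev = [], None
    for t, (f, m) in enumerate(zip(frames, masks)):
        lo, hi = max(0, t - window), min(len(frames), t + window + 1)
        refs = [frames[i] for i in range(lo, hi) if i != t]
        rmasks = [masks[i] for i in range(lo, hi) if i != t]
        prev = inpaint_frame(f, m, refs, rmasks, prev)
        outputs.append(prev)
    return outputs
```

In the actual framework the aggregation and blending are performed on learned encoder features and the recurrence is realized inside the decoder, but the data flow (reference-frame hints in, aggregated fill, previous output fed back) follows the same shape.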

