IEEE Transactions on Circuits and Systems for Video Technology

Deep Network-Based Frame Extrapolation With Reference Frame Alignment



Abstract

Frame extrapolation predicts future frames from past (reference) frames; it has been studied intensively in computer vision research and has great potential in video coding. Recently, a number of studies have applied deep networks to frame extrapolation, with some success. However, due to the complex and diverse motion patterns in natural video, it remains difficult to extrapolate high-fidelity frames directly from reference frames. To address this problem, we introduce reference frame alignment as a key technique for deep network-based frame extrapolation. We propose to align the reference frames, e.g. using block-based motion estimation and motion compensation, and then to extrapolate from the aligned frames with a trained deep network. Since the alignment, as a preprocessing step, effectively reduces the diversity of the network input, we observe that the network is easier to train and the extrapolated frames are of higher quality. We verify the proposed technique in video coding, using the extrapolated frame for inter prediction in High Efficiency Video Coding (HEVC) and Versatile Video Coding (VVC). We investigate different schemes, including whether to align the target frame with the reference frames, and whether to perform motion estimation on the extrapolated frame. We conduct a comprehensive set of experiments to study the efficiency of the proposed method and to compare the different schemes. Experimental results show that our proposal achieves on average 5.3% and 2.8% BD-rate reduction in the Y component compared to HEVC, under the low-delay P and low-delay B configurations, respectively. Our proposal performs much better than frame extrapolation without reference frame alignment.
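To make the alignment step concrete, the following is a minimal sketch of block-based motion estimation and motion compensation used to warp a past reference frame toward the most recent frame, so that the aligned frames can be stacked as input to an extrapolation network. This is not the paper's implementation: the exhaustive SAD search, the block and search-range sizes, and all function names are illustrative, and the deep network itself is omitted.

```python
import numpy as np

def block_motion_search(ref, cur, block=8, search=4):
    """Exhaustive block matching: for each block of `cur`, find the
    best-matching block in `ref` within a +/-`search` window (SAD cost).
    Returns a (H/block, W/block, 2) field of (dy, dx) motion vectors."""
    H, W = cur.shape
    mv = np.zeros((H // block, W // block, 2), dtype=int)
    for by in range(H // block):
        for bx in range(W // block):
            y, x = by * block, bx * block
            tgt = cur[y:y + block, x:x + block].astype(int)
            best, best_mv = np.inf, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    yy, xx = y + dy, x + dx
                    if yy < 0 or xx < 0 or yy + block > H or xx + block > W:
                        continue  # candidate block falls outside the frame
                    cand = ref[yy:yy + block, xx:xx + block].astype(int)
                    cost = np.abs(cand - tgt).sum()  # sum of absolute differences
                    if cost < best:
                        best, best_mv = cost, (dy, dx)
            mv[by, bx] = best_mv
    return mv

def motion_compensate(ref, mv, block=8):
    """Warp `ref` toward the current frame using the block motion field,
    producing an aligned reference frame for the extrapolation network."""
    H, W = ref.shape
    out = np.zeros_like(ref)
    for by in range(mv.shape[0]):
        for bx in range(mv.shape[1]):
            y, x = by * block, bx * block
            dy, dx = mv[by, bx]
            out[y:y + block, x:x + block] = ref[y + dy:y + dy + block,
                                                x + dx:x + dx + block]
    return out

# Usage: align each past frame to the newest one, then feed the stack
# (aligned references + newest frame) to a trained extrapolation network.
rng = np.random.default_rng(0)
ref = rng.integers(0, 256, (16, 16)).astype(np.uint8)
cur = np.roll(ref, (2, 3), axis=(0, 1))  # simulate pure translational motion
mv = block_motion_search(ref, cur)
aligned = motion_compensate(ref, mv)     # interior now matches `cur`
```

In a coding context this reuses machinery the codec already has (block-based motion estimation and compensation), which is why it is attractive as a preprocessing step: the network only has to model the residual motion left after alignment.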
