Journal of Visual Communication & Image Representation

PTR-CNN for in-loop filtering in video coding


Abstract

A deep learning method called PTR-CNN (Predicted frame with Transform unit partition and prediction Residual aided CNN) is proposed for in-loop filtering in video compression. To reduce the computational complexity of an end-to-end CNN in-loop filter, a non-learning reference-frame selection method is designed that picks the highest-quality frame based on each frame's blurriness and smoothness scores. The transform unit (TU) partition and the prediction residual (PR) of the current frame are fed to the neural network as extra inputs that guide the filtering. The selected similar, high-quality reference frame (RF) and the current unfiltered frame (CUF) are input to a CNN-based motion compensation module to generate a predicted frame (PF). Finally, the PF, the CUF, the CUF's TU partition, and the CUF's PR are input to the main CNN to reconstruct the filtered frame. The model is implemented in TensorFlow and tested in HEVC and AV1. Experimental results show that the proposed PTR-CNN is less complex than state-of-the-art CNN-based reference-aided in-loop filtering methods while slightly outperforming them in RD performance. The scheme introduces a complexity overhead of 7% on the encoder. In particular, for random access, the proposed model achieves an 11.78% coding gain over HEVC with DBF/SAO off, and a 4.76% gain over HEVC with DBF/SAO on. An ablation study shows that the RF contributes about 10% of the total gain and that the TU and PR together contribute over 4%, demonstrating the effectiveness of each module. Moreover, the proposed method is observed to restore detailed structures and textures, thereby improving subjective quality.
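The non-learning reference-frame selection step could be sketched as below. This is a minimal illustration only: the abstract does not state the exact blurriness and smoothness metrics, so Laplacian variance (blurriness) and mean gradient magnitude (smoothness) are assumed here, and all function names are hypothetical.

```python
import numpy as np

def blurriness_score(frame: np.ndarray) -> float:
    """Variance of the Laplacian response; low variance suggests a blurry frame.
    (Assumed metric -- the paper's exact scoring is not given in the abstract.)"""
    k = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=np.float64)
    p = np.pad(frame.astype(np.float64), 1, mode="edge")
    h, w = frame.shape
    # Correlate the 3x3 Laplacian kernel over the padded frame.
    lap = sum(k[i, j] * p[i:i + h, j:j + w] for i in range(3) for j in range(3))
    return float(lap.var())

def smoothness_score(frame: np.ndarray) -> float:
    """Mean absolute gradient magnitude; lower means a smoother frame."""
    gy, gx = np.gradient(frame.astype(np.float64))
    return float(np.mean(np.abs(gx) + np.abs(gy)))

def select_reference(candidates: list[np.ndarray]) -> np.ndarray:
    """Pick the candidate reference frame with the best combined score.
    'Highest quality' is taken here as sharp (high Laplacian variance)
    relative to smoothness -- a stand-in for the paper's criterion."""
    return max(candidates,
               key=lambda f: blurriness_score(f) / (1.0 + smoothness_score(f)))
```

Because this step involves no learned parameters, it adds negligible cost next to the CNN stages, which is consistent with the abstract's emphasis on reducing complexity.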

