Published in: IEEE Journal of Selected Topics in Signal Processing

Attention-Based Neural Networks for Chroma Intra Prediction in Video Coding


Abstract

Neural networks can be successfully used to improve several modules of advanced video coding schemes. In particular, compression of colour components was shown to benefit greatly from the use of machine learning models, thanks to the design of appropriate attention-based architectures that allow the prediction to exploit specific samples in the reference region. However, such architectures tend to be complex and computationally intensive, and may be difficult to deploy in a practical video coding pipeline. This work focuses on reducing the complexity of such methodologies, to design a set of simplified and cost-effective attention-based architectures for chroma intra-prediction. A novel size-agnostic multi-model approach is proposed to reduce the complexity of the inference process. The resulting simplified architecture is still capable of outperforming state-of-the-art methods. Moreover, a collection of simplifications is presented in this paper to further reduce the complexity overhead of the proposed prediction architecture. Thanks to these simplifications, a reduction in the number of parameters of around 90% is achieved with respect to the original attention-based methodologies. The simplifications include a framework for reducing the overhead of the convolutional operations, a simplified cross-component processing model integrated into the original architecture, and a methodology for performing integer-precision approximations with the aim of obtaining fast and hardware-aware implementations. The proposed schemes are integrated into the Versatile Video Coding (VVC) prediction pipeline, retaining the compression efficiency of state-of-the-art chroma intra-prediction methods based on neural networks, while offering different directions for significantly reducing coding complexity.
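The abstract mentions integer-precision approximations but does not describe the method. As a rough illustration of the general idea only, the sketch below replaces a floating-point dot product (the core operation of a dense or convolutional layer) with fixed-point integer arithmetic. The function names, the choice of 8 fractional bits, and the rounding scheme are all assumptions made for this example; they are not taken from the paper.

```python
# Generic fixed-point sketch (hypothetical names; not the paper's actual scheme).

def quantize(values, frac_bits):
    """Map floats to integers carrying `frac_bits` fractional bits."""
    scale = 1 << frac_bits
    return [int(round(v * scale)) for v in values]


def int_dot(q_weights, q_inputs, frac_bits):
    """Integer-only dot product of two fixed-point vectors.

    Each product carries 2 * frac_bits fractional bits, so the accumulator
    is shifted right by frac_bits (with a rounding offset) to return to the
    input precision.
    """
    acc = sum(w * x for w, x in zip(q_weights, q_inputs))
    return (acc + (1 << (frac_bits - 1))) >> frac_bits


def dequantize(q_value, frac_bits):
    """Convert a fixed-point integer back to a float, for comparison only."""
    return q_value / (1 << frac_bits)


if __name__ == "__main__":
    FRAC_BITS = 8  # assumed precision, chosen only for illustration
    weights = [0.5, -0.25, 0.125]
    inputs = [1.0, 2.0, 4.0]

    reference = sum(w * x for w, x in zip(weights, inputs))
    approx = dequantize(
        int_dot(quantize(weights, FRAC_BITS), quantize(inputs, FRAC_BITS), FRAC_BITS),
        FRAC_BITS,
    )
    print(reference, approx)
```

In a sketch like this, only `int_dot` would run at inference time; the weights are quantized once offline, which is what makes such approximations attractive for hardware-oriented decoders.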
