Journal: IEEE Transactions on Circuits and Systems for Video Technology

Analysis, fast algorithm, and VLSI architecture design for H.264/AVC intra frame coder


Abstract

Intra prediction with rate-distortion constrained mode decision is the most important technology in the H.264/AVC intra frame coder, which is competitive with the latest image coding standard, JPEG2000, in terms of both coding performance and computational complexity. The predictor generation engine for intra prediction and the transform engine for mode decision are critical because their operations require a lot of memory access and occupy 80% of the computation time of the entire intra compression process. A low-cost general-purpose processor cannot process these operations in real time. In this paper, we propose two solutions for platform-based design of an H.264/AVC intra frame coder. One solution is a software implementation targeted at low-end applications. Our fast algorithm combines context-based decimation of unlikely candidates, subsampling of matching operations, bit-width truncation to reduce computation, and an interleaved full-search/partial-search strategy to stop error propagation and maintain image quality. Experimental results show that our method reduces the computation used for intra prediction and mode decision by 60% while keeping the peak signal-to-noise ratio degradation below 0.3 dB. The other solution is a hardware accelerator targeted at high-end applications. After comprehensive analysis of instructions and exploration of parallelism, we propose a system architecture with four-parallel intra prediction and mode decision to enhance the processing capability. Hadamard-based mode decision is replaced with a discrete cosine transform-based version to reduce memory access by 40%. Two-stage macroblock pipelining is also proposed to double the processing speed and hardware utilization. Other features of our design are a reconfigurable predictor generator supporting all 13 intra prediction modes, a parallel multitransform and inverse transform engine, and a CAVLC bitstream engine. A prototype chip is fabricated with TSMC 0.25-µm CMOS 1P5M technology. Simulation results show that our implementation can process 16 megapixels (4096×4096) within 1 s, that is, 720×480 4:2:0 30 Hz video in real time, at an operating frequency of 54 MHz. The transistor count is 429 K, and the core size is only 1.855×1.885 mm².
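As a rough sanity check on the throughput figures quoted in the abstract, the sketch below works out the cycle budget per macroblock at 54 MHz for 720×480, 30 Hz video. It assumes the standard 16×16 H.264/AVC macroblock. It also assumes that chroma samples are included in the pixel total when comparing against the "16 megapixels per second" figure; the abstract does not state this, so treat that comparison as illustrative only. All function and variable names are made up for this example.

```python
# Back-of-the-envelope cycle budget from the figures quoted in the abstract.
# Names here are illustrative; none come from the paper.

CLOCK_HZ = 54_000_000            # operating frequency quoted in the abstract
WIDTH, HEIGHT, FPS = 720, 480, 30
MB_SIZE = 16                     # H.264/AVC macroblock: 16x16 luma samples


def macroblocks_per_second(width, height, fps, mb_size=MB_SIZE):
    """Number of macroblocks that must be coded each second."""
    return (width // mb_size) * (height // mb_size) * fps


def samples_per_second_420(width, height, fps):
    """Luma plus chroma samples per second for 4:2:0 video.

    In 4:2:0 each of the two chroma planes has a quarter of the luma samples.
    Assumption: the abstract's pixel count includes chroma samples.
    """
    luma = width * height
    return (luma + 2 * (luma // 4)) * fps


mb_rate = macroblocks_per_second(WIDTH, HEIGHT, FPS)
cycles_per_mb = CLOCK_HZ / mb_rate
sample_rate = samples_per_second_420(WIDTH, HEIGHT, FPS)

print(f"macroblocks per second : {mb_rate}")
print(f"cycles per macroblock  : {cycles_per_mb:.1f}")
print(f"4:2:0 samples per sec  : {sample_rate}")
print(f"4096 x 4096            : {4096 * 4096}")
```

Under these assumptions the encoder has roughly 1,333 clock cycles for each macroblock, and the 4:2:0 sample rate (about 15.6 million per second) is of the same order as the 4096×4096 figure, which is consistent with the two throughput statements describing the same capability.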
