Journal of Real-Time Image Processing

Buffer structure optimized VLSI architecture for efficient hierarchical integer pixel motion estimation implementation

Abstract

Integer pixel motion estimation (IME) is a crucial, high-complexity module in a high-definition video encoder. An efficient joint design of algorithm and architecture must trade off multiple target parameters, including throughput, logic gate count, on-chip SRAM size, memory bandwidth, and rate-distortion performance. Data organization and on-chip buffer structure are crucial factors in IME architecture design because they determine how these targets trade off against one another. In this work, we combine global hierarchical search with local full search to obtain a hardware-efficient IME algorithm, and then propose a VLSI architecture with an optimized on-chip buffer structure. The major contributions of this work are: (1) an improved hierarchical IME algorithm with presearch and deliberate data organization, (2) a multistage on-chip reference pixel buffer structure with high data reuse between integer and fractional pixel motion estimation, and (3) a highly reused and reconfigurable processing element structure. The optimized data organization and buffer structure save nearly 70% of the buffer, with an average PSNR degradation below 0.08 dB (0.12 dB in the worst case) compared with a full-search-based architecture. At a hardware cost of 336 K and 382 K logic gates and 20 kB of SRAM, the proposed architecture achieves a throughput of 384 and 272 cycles per macroblock at system frequencies of 95 and 264 MHz for 1080p and QFHD video coding at 30 fps, respectively.
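
The cycle counts and clock frequencies quoted in the abstract can be cross-checked with a short calculation. The sketch below is not taken from the paper; it assumes 16x16 macroblocks, a QFHD frame of 3840x2160 pixels, and frame dimensions rounded up to whole macroblocks. The function names are hypothetical.

```python
# Rough consistency check of the abstract's throughput figures.
# Assumptions (not stated in the abstract): 16x16 macroblocks,
# QFHD = 3840x2160, frame size padded up to whole macroblocks.

MB_SIZE = 16  # assumed macroblock edge length in pixels


def macroblocks_per_frame(width: int, height: int) -> int:
    """Number of macroblocks per frame, rounding each dimension up."""
    mb_cols = -(-width // MB_SIZE)   # integer ceiling division
    mb_rows = -(-height // MB_SIZE)
    return mb_cols * mb_rows


def required_mhz(width: int, height: int, fps: int, cycles_per_mb: int) -> float:
    """Minimum clock frequency (MHz) needed to process every macroblock in real time."""
    cycles_per_second = macroblocks_per_frame(width, height) * fps * cycles_per_mb
    return cycles_per_second / 1e6


if __name__ == "__main__":
    # Cycles per macroblock are the figures given in the abstract.
    print(f"1080p @30fps: {required_mhz(1920, 1080, 30, 384):.1f} MHz")
    print(f"QFHD  @30fps: {required_mhz(3840, 2160, 30, 272):.1f} MHz")
```

Under these assumptions the results are about 94.0 MHz for 1080p and about 264.4 MHz for QFHD, which agrees with the 95 MHz and 264 MHz system frequencies reported in the abstract.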
