Source: Journal of VLSI signal processing systems

Memory Performance Optimizations for Real-Time Software HDTV Decoding



Abstract

Pure software HDTV video decoding is still a challenging task on entry-level to mid-range desktop and notebook PCs, even with today's microprocessor clock frequencies measured in GHz. This paper shows that the performance bottleneck in a software MPEG-2 decoder has shifted to memory operations, as microprocessor technologies, including multimedia instruction extensions, have improved rapidly over the past years. Our study exploits concurrency at the macroblock level to alleviate this bottleneck. First, the paper introduces an interleaved block-order data layout to improve CPU cache performance. Second, it describes an algorithm to explicitly prefetch macroblocks for motion compensation. Finally, it presents an algorithm to schedule interleaved decoding and output at the macroblock level. Our implementation and experiments show that these methods can effectively hide the latency of memory and the frame buffer. Together, the optimizations improve the performance of a multimedia-instruction-optimized software MPEG-2 decoder by a factor of about two. On a PC with a 933 MHz Pentium III CPU, the decoder can decode and display 1280 x 720-resolution HDTV streams at over 62 frames per second.
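To make the cache argument behind a block-order layout concrete, here is a minimal sketch that counts how many cache lines one macroblock occupies under a conventional raster layout versus a simple block-order (tiled) layout. It is an illustration under stated assumptions, not the paper's actual layout: the abstract does not specify how the interleaving is done, so the sketch covers only a single luma plane. The 32-byte cache line size, the function names, and the chosen macroblock position are all assumptions made for the example.

```python
# Illustrative sketch only: a plain tiled layout for one luma plane.
# The paper's "interleaved" block-order layout is not specified in the
# abstract, so this is NOT the authors' exact scheme.

CACHE_LINE = 32  # bytes per cache line (assumed for illustration)
MB = 16          # an MPEG-2 macroblock covers 16x16 luma samples


def raster_offset(x, y, width):
    """Byte offset of sample (x, y) when rows are stored one after another."""
    return y * width + x


def block_order_offset(x, y, width):
    """Byte offset of sample (x, y) when each macroblock is stored contiguously."""
    mbs_per_row = width // MB
    mb_index = (y // MB) * mbs_per_row + (x // MB)
    return mb_index * MB * MB + (y % MB) * MB + (x % MB)


def cache_lines_touched(offset_fn, mb_x, mb_y, width):
    """Count the distinct cache lines covered by one aligned macroblock."""
    lines = set()
    for dy in range(MB):
        for dx in range(MB):
            off = offset_fn(mb_x * MB + dx, mb_y * MB + dy, width)
            lines.add(off // CACHE_LINE)
    return len(lines)


WIDTH = 1280  # luma width of a 1280 x 720 frame
raster_lines = cache_lines_touched(raster_offset, 3, 2, WIDTH)
block_lines = cache_lines_touched(block_order_offset, 3, 2, WIDTH)
print(raster_lines, block_lines)  # prints "16 8"
```

Under these assumptions the raster layout spreads the macroblock over 16 cache lines that sit 1280 bytes apart, using only half of each line, while the block-order layout packs the same 256 bytes into 8 adjacent lines. Motion-compensation reference blocks are generally not aligned to macroblock boundaries, so the real access pattern is less tidy than this aligned case.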
