Published in: IEEE Transactions on Circuits and Systems for Video Technology

Multimedia processor-based implementation of an error-diffusion halftoning algorithm exploiting subword parallelism



Abstract

Multimedia processor-based implementations of digital image processing algorithms have become important since several multimedia processors are now available and can replace special-purpose hardware-based systems because of their flexibility. Multimedia processors increase throughput by processing multiple pixels simultaneously using a subword-parallel arithmetic and logic unit architecture. The error-diffusion halftoning algorithm employs feedback of quantized output signals to faithfully convert a multi-level image to a binary image or to one with fewer levels of quantization. This feedback makes it difficult to achieve speedup by utilizing the multimedia extension. In this study, the error-diffusion halftoning algorithm is implemented for a multimedia processor using three methods: single-pixel, single-line, and multiple-line processing. The single-pixel approach is the closest to conventional implementations, but the multimedia extension is used only in the filter kernel. The single-line approach computes multiple pixels in one scan-line simultaneously, but requires a complex algorithm transformation to remove dependencies between pixels. The multiple-line method exploits parallelism by employing a skewed data structure and processing multiple pixels in different scan-lines. The Pentium MMX instruction set is used for quantitative performance evaluation, including run-time overheads and misaligned memory accesses. A speedup of more than ten times is achieved compared to the software (integer C) implementation on a conventional processor for the structurally sequential error-diffusion halftoning algorithm.
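To make the feedback dependency and the skewed multiple-line idea concrete, here is a minimal scalar sketch in integer C. It is not the paper's implementation: the Floyd-Steinberg weights (7, 3, 5, 1 over 16), the threshold of 128, the skew of two pixels per scan-line, and all function names are assumptions chosen for illustration, and no MMX instructions are used. The sketch only shows that pixels sharing the same skewed time step `t = c + 2*r` do not read each other's results, so a skewed schedule produces the same output as plain raster order.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Quantize one pixel and push its error to unprocessed neighbours.
 * err[] holds accumulated error in units of 1/16.
 * Floyd-Steinberg weights are an assumption; the abstract names no kernel. */
static void diffuse_pixel(const uint8_t *in, uint8_t *out, int *err,
                          int w, int h, int r, int c)
{
    int v = in[r * w + c] + err[r * w + c] / 16; /* input plus fed-back error */
    int q = (v >= 128) ? 255 : 0;                /* binary quantizer */
    int e = v - q;                               /* quantization error */
    out[r * w + c] = (uint8_t)q;
    if (c + 1 < w) err[r * w + c + 1] += 7 * e;
    if (r + 1 < h) {
        if (c > 0) err[(r + 1) * w + c - 1] += 3 * e;
        err[(r + 1) * w + c] += 5 * e;
        if (c + 1 < w) err[(r + 1) * w + c + 1] += 1 * e;
    }
}

/* Conventional order: one pixel at a time, left to right, top to bottom. */
void halftone_raster(const uint8_t *in, uint8_t *out, int w, int h)
{
    int *err = calloc((size_t)w * (size_t)h, sizeof *err);
    assert(err != NULL);
    for (int r = 0; r < h; ++r)
        for (int c = 0; c < w; ++c)
            diffuse_pixel(in, out, err, w, h, r, c);
    free(err);
}

/* Skewed order: row r lags row r-1 by two pixels. Every pixel handled in
 * the same step t has already received all four of its error inputs, so
 * the inner loop over r carries no read-after-write dependency. */
void halftone_skewed(const uint8_t *in, uint8_t *out, int w, int h)
{
    int *err = calloc((size_t)w * (size_t)h, sizeof *err);
    assert(err != NULL);
    for (int t = 0; t <= (w - 1) + 2 * (h - 1); ++t) {
        for (int r = 0; r < h; ++r) {
            int c = t - 2 * r;
            if (c >= 0 && c < w)
                diffuse_pixel(in, out, err, w, h, r, c);
        }
    }
    free(err);
}
```

In this sketch the inner loop over `r` in `halftone_skewed` is still executed sequentially. On a subword-parallel processor, that loop is the place where pixels from different scan-lines could be packed into one register and processed together, at the cost of the data rearrangement the abstract refers to as a skewed data structure.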
