Published in: 電子情報通信学会技術研究報告. 集積回路. Integrated Circuits and Devices

A parallelizing compiler in a hardware/software cosynthesis system for image/video processor with packed SIMD type instruction sets



Abstract

Many current general-purpose processors and digital signal processors have extended instruction sets that improve the performance of image/video processing applications. The extended functionality comes primarily from the addition of packed SIMD type instructions, which aim to exploit subword parallelism. A packed SIMD type instruction set includes hundreds of instructions, but a small subset of them is enough to implement most image/video processing applications. Thus, applying application-specific synthesis to such a processor can significantly reduce its area while still meeting an execution-time constraint. In this paper, we propose a hardware/software cosynthesis system for processors with a packed SIMD type instruction set, together with an algorithm for its compiler that performs SIMD parallelization within a register. The input to the system is an application description written in C together with application data; the output is a hardware description of the synthesized processor core, the application binary code to be executed on that core, and a software environment. The compiler generates object code assuming a processor core with all the available hardware units. It exploits instruction-level and subword-level parallelism and attempts to minimize execution time. The experimental results show the effectiveness of the compiler.
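As a rough illustration of the subword parallelism the abstract refers to, the sketch below emulates a packed 8-bit addition inside a single 32-bit register using portable C. It is not the algorithm proposed in the paper. The function names (`packed_add_u8x4`, `scalar_add_u8x4`, `check_packed_add`) and the choice of four unsigned 8-bit lanes with wrap-around arithmetic are assumptions made only for this example. On a processor with packed SIMD instructions, the whole of `packed_add_u8x4` would typically correspond to one instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: add four unsigned 8-bit lanes packed into one
 * 32-bit word. Each lane wraps modulo 256 and no carry leaks into the
 * neighbouring lane. */
uint32_t packed_add_u8x4(uint32_t a, uint32_t b)
{
    const uint32_t low7 = 0x7F7F7F7Fu; /* lower 7 bits of every lane */
    const uint32_t msb  = 0x80808080u; /* top bit of every lane */

    /* Adding only the low 7 bits cannot overflow a lane (max 0xFE). */
    uint32_t sum = (a & low7) + (b & low7);

    /* Restore the top bit of each lane: carry-in XOR a's bit XOR b's bit. */
    return sum ^ ((a ^ b) & msb);
}

/* Reference version: the same computation done one lane at a time,
 * as a scalar loop would do it. */
uint32_t scalar_add_u8x4(uint32_t a, uint32_t b)
{
    uint32_t result = 0;
    for (int lane = 0; lane < 4; lane++) {
        uint8_t x = (uint8_t)(a >> (8 * lane));
        uint8_t y = (uint8_t)(b >> (8 * lane));
        result |= (uint32_t)(uint8_t)(x + y) << (8 * lane);
    }
    return result;
}

/* Self-check: the packed result must match the lane-by-lane reference. */
void check_packed_add(uint32_t a, uint32_t b)
{
    assert(packed_add_u8x4(a, b) == scalar_add_u8x4(a, b));
}
```

Calling `check_packed_add(0x01020304u, 0x10203040u)` exercises both paths on the same inputs. The packed version replaces four scalar additions with a handful of word-wide operations, which is the kind of saving a compiler performing SIMD parallelization within a register would look for in pixel-processing loops.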
