Journal: Information Technology Journal

Hardware Implementation of Instruction Level Parallel Architecture Incorporating Special Functional Units for Image Processing Algorithms



Abstract

Parallel processing is an efficient form of information processing that emphasizes the exploitation of concurrent events in computation. In a sequence of assembly instructions for a specific problem, many consecutive instructions are found to be independent of each other, with no data dependencies between them. This work exploits such situations by executing pairs of mutually independent instructions on two different processing elements, thereby increasing the speed of operation. Not every two instructions taken from a sequence can run in parallel: the various types of dependencies among instructions are the bottleneck in parallel execution. The possible data dependencies and control transfers are handled so that most instructions run in pairs. The ILP (Instruction Level Parallelism) architecture designed here is intended for image processing applications. Since specific hardware solutions are always faster than their software counterparts, the architecture includes dedicated hardware units for the most frequently used image processing problems, namely computing the DFT and the DCT. The proposed architecture achieves a speed-up factor of more than 1.5. With fewer data dependencies a higher speed-up factor is attainable, upper bounded at 2 by Amdahl's law.
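
The pairing idea in the abstract can be illustrated with a small sketch. This is not the paper's hardware design: the `Instr` record, the register names, and the greedy rule of pairing only adjacent instructions are assumptions made for illustration. The check is deliberately conservative (it blocks on read-after-write, write-after-read, and write-after-write register conflicts, issues control transfers alone, and ignores dependencies through memory). With two processing elements at most two instructions issue per cycle, so the ratio computed here can never exceed 2.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical three-address instruction: dest <- op(srcs)."""
    op: str
    dest: Optional[str]            # None when no register is written (e.g. a store)
    srcs: Tuple[str, ...] = ()
    is_branch: bool = False


def independent(a: Instr, b: Instr) -> bool:
    """True if b may issue in the same cycle as a (a precedes b in program order).

    Assumption: only register dependencies are checked; memory is ignored.
    """
    if a.is_branch or b.is_branch:
        return False  # conservative: control transfers issue alone
    raw = a.dest is not None and a.dest in b.srcs   # b reads what a writes
    war = b.dest is not None and b.dest in a.srcs   # b overwrites what a reads
    waw = a.dest is not None and a.dest == b.dest   # both write the same register
    return not (raw or war or waw)


def pair_schedule(program: List[Instr]) -> List[Tuple[Instr, ...]]:
    """Greedily pair adjacent independent instructions; one tuple per cycle."""
    cycles: List[Tuple[Instr, ...]] = []
    i = 0
    while i < len(program):
        if i + 1 < len(program) and independent(program[i], program[i + 1]):
            cycles.append((program[i], program[i + 1]))
            i += 2
        else:
            cycles.append((program[i],))
            i += 1
    return cycles


def speedup(program: List[Instr]) -> float:
    """Single-issue cycle count divided by dual-issue cycle count."""
    return len(program) / len(pair_schedule(program))


# Illustrative instruction sequence (made up, not taken from the paper).
program = [
    Instr("load", "r1", ("a",)),
    Instr("load", "r2", ("b",)),
    Instr("add", "r3", ("r1", "r2")),
    Instr("load", "r4", ("c",)),
    Instr("mul", "r5", ("r3", "r4")),   # next instruction reads r5, so no pairing
    Instr("store", None, ("r5",)),
]

print(speedup(program))  # 1.5
```

In this made-up sequence the last two instructions form a dependency chain and issue alone, so six instructions take four cycles. A sequence with no dependencies at all would reach the bound of 2.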
