Journal: IEE proceedings. Part G, Circuits, devices and systems
Efficient technique for partitioning and programming linear algebra algorithms on concurrent VLSI architectures

Abstract

An efficient technique for partitioning and programming linear algebra algorithms on concurrent architectures is described and applied to 2-D wavefront arrays. The mapping of the computational elements (processes) to processors is based on the concept of folding. The mapping pattern on the 2-D full-size mesh of processes is a composition of symmetric tiles of size 2√N × 2√N, N being the number of processors. The algorithm can be partitioned according to a globally sequential, locally parallel scheme. The code optimisation is performed by programming a few different types of tile, according to the algorithm. When the size of the problem is much larger than the size of the mesh of processors, a linear speed-up is achieved independently of the number of processors. Experimental results are presented for matrix multiplication, LU decomposition and the solution of triangular system equations on 2-D meshes of transputers programmed in Occam.
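
The folding idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact mapping, so the reflection rule below is an assumption: process indices are reflected back and forth across a √N × √N processor mesh, which yields a pattern that repeats in symmetric tiles of side 2√N. The names `fold_index`, `fold_map` and `mapping_pattern` are hypothetical and are not taken from the paper.

```python
import math


def fold_index(i: int, p: int) -> int:
    """Fold a 1-D process index onto p processors by reflection (assumed rule).

    Indices run 0..p-1, then back p-1..0, repeating with period 2*p.
    """
    r = i % (2 * p)
    return r if r < p else 2 * p - 1 - r


def fold_map(i: int, j: int, n_processors: int) -> tuple:
    """Map process (i, j) on the full-size mesh to a processor (row, col).

    Assumes a square processor mesh, so n_processors must be a perfect square.
    """
    p = math.isqrt(n_processors)
    if p * p != n_processors:
        raise ValueError("n_processors must be a perfect square")
    return fold_index(i, p), fold_index(j, p)


def mapping_pattern(size: int, n_processors: int) -> list:
    """Return the processor assigned to every process of a size x size mesh."""
    return [[fold_map(i, j, n_processors) for j in range(size)]
            for i in range(size)]


if __name__ == "__main__":
    # 8 x 8 processes folded onto N = 4 processors (a 2 x 2 mesh).
    for row in mapping_pattern(8, 4):
        print(row)
```

Under this assumed rule, neighbouring processes always land on the same processor or on adjacent ones, which is the property a wavefront array needs for local communication.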
