Conference: High Performance Computing Symposium

Porting the Sisal functional language to the EM-X distributed-memory multiprocessor

Abstract

Distributed-memory multiprocessors are regarded as a viable, scalable, and economical architecture for building large-scale parallel machines. While these machines can provide computational capabilities, programming them is often very difficult because of practical issues including parallelization, data distribution, workload distribution, and remote memory latency. This report aims to solve the programmability and performance issues of distributed-memory machines using the Sisal functional language. Programs written in Sisal are automatically parallelized, scheduled, and run on the EM-X distributed-memory multiprocessor with no programmer intervention. Specifically, the proposed approach consists of the following steps. Given a program written in Sisal, the front-end Sisal compiler generates a directed acyclic graph (DAG) to expose parallelism in the program. The DAG is partitioned and scheduled based on loop parallelism. The scheduled DAG is then translated to C programs with machine-specific parallel constructs. Finally, the parallel C programs are compiled by the target machine's compiler to generate an executable. Bitonic sorting and FFT are selected for experiments. Experimental results indicate that automatic parallelization of Sisal programs based on loop parallelism is effective, giving a 17-fold speedup for bitonic sorting and a 60-fold speedup for FFT on 64 processors.
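To make the idea of loop-parallel partitioning and the reported speedups more concrete, here is a minimal Python sketch. It is not the paper's compiler or its scheduling algorithm: the abstract does not say which partitioning scheme is used, so the contiguous block split below is only an assumed, commonly used scheme, and the names `block_partition` and `parallel_efficiency` are hypothetical. The only figures taken from the abstract are the 64 processors and the 17-fold and 60-fold speedups.

```python
def block_partition(n_iters, n_procs):
    """Split range(n_iters) into n_procs contiguous chunks.

    Chunk sizes differ by at most one iteration. This is an assumed,
    illustrative scheme, not the one described in the paper.
    """
    base, extra = divmod(n_iters, n_procs)
    chunks = []
    start = 0
    for p in range(n_procs):
        # The first `extra` processors each take one additional iteration.
        size = base + (1 if p < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks


def parallel_efficiency(speedup, n_procs):
    """Parallel efficiency: speedup divided by the number of processors."""
    return speedup / n_procs


# Figures reported in the abstract: 64 processors, 17-fold and 60-fold speedup.
PROCS = 64
REPORTED_SPEEDUP = {"bitonic sort": 17, "FFT": 60}

efficiency = {
    name: parallel_efficiency(s, PROCS) for name, s in REPORTED_SPEEDUP.items()
}

if __name__ == "__main__":
    # Show how a hypothetical 10-iteration loop would be split over 4 processors.
    for p, chunk in enumerate(block_partition(10, 4)):
        print(f"processor {p}: iterations {list(chunk)}")
    for name, eff in efficiency.items():
        print(f"{name}: efficiency {eff:.2%}")
```

Under these definitions, the reported speedups correspond to a parallel efficiency of roughly 27% for bitonic sorting and roughly 94% for FFT on 64 processors.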