AutoParallel: Automatic parallelisation and distributed execution of affine loop nests in Python

Cristian Ramon-Cortes; Ramon Amela; Jorge Ejarque; Philippe Clauss; Rosa M. Badia

Abstract

Recent improvements in programming languages and models have focused on simplicity and abstraction, propelling Python to the top of the list of programming languages. However, there is still room for improvement in shielding users from dealing directly with distributed and parallel computing issues. This paper proposes and evaluates AutoParallel, a Python module that automatically finds an appropriate task-based parallelisation of affine loop nests and executes them in parallel on a distributed computing infrastructure. It is based on sequential programming and requires a single annotation (in the form of a Python decorator), so that anyone with intermediate-level programming skills can scale up an application to hundreds of cores. The evaluation demonstrates that AutoParallel goes one step further in easing the development of distributed applications. On the one hand, the programmability evaluation highlights the benefits of using a single Python decorator instead of manually annotating each task and its parameters or, even worse, having to develop the parallel code explicitly (e.g., using OpenMP or MPI). On the other hand, the performance evaluation demonstrates that AutoParallel can automatically generate task-based workflows from sequential Python code while achieving the same performance as manually taskified versions of established state-of-the-art algorithms (namely the Cholesky, LU, and QR decompositions). Finally, AutoParallel can also automatically build data blocks to increase task granularity, freeing the user from creating the data chunks and redesigning the algorithm. For advanced users, we believe that this feature can be useful as a baseline for designing blocked algorithms.
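To make the contrast in the abstract concrete, the following is a minimal sketch of what "manually taskifying" an affine loop nest can look like. It does not use AutoParallel or its decorator. It relies only on the standard-library `concurrent.futures` module as a stand-in for a task runtime, and the names `matmul_sequential`, `block_task`, and `matmul_taskified`, as well as the `block` and `workers` parameters, are hypothetical and chosen for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor


def matmul_sequential(a, b):
    """Plain affine loop nest: every array index is an affine function of i, j, k."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            for k in range(m):
                c[i][j] += a[i][k] * b[k][j]
    return c


def block_task(a, b, i0, i1, j0, j1):
    """One hand-written task: compute the output block with rows i0..i1-1, columns j0..j1-1."""
    m = len(b)
    return [
        [sum(a[i][k] * b[k][j] for k in range(m)) for j in range(j0, j1)]
        for i in range(i0, i1)
    ]


def matmul_taskified(a, b, block=2, workers=4):
    """Hand-taskified variant (illustrative assumption): one independent task per output block."""
    n, p = len(a), len(b[0])
    c = [[0] * p for _ in range(n)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {}
        # The user has to choose the blocking and spawn the tasks explicitly.
        for i0 in range(0, n, block):
            for j0 in range(0, p, block):
                i1, j1 = min(i0 + block, n), min(j0 + block, p)
                futures[(i0, j0)] = pool.submit(block_task, a, b, i0, i1, j0, j1)
        # Collect each finished block and copy it into the result matrix.
        for (i0, j0), future in futures.items():
            for di, row in enumerate(future.result()):
                c[i0 + di][j0:j0 + len(row)] = row
    return c


if __name__ == "__main__":
    a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    b = [[9, 8, 7], [6, 5, 4], [3, 2, 1]]
    print(matmul_taskified(a, b) == matmul_sequential(a, b))
```

The blocking, task creation, and result assembly in `matmul_taskified` are the kind of manual work that, according to the abstract, AutoParallel aims to derive automatically from the sequential loop nest.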

Bibliographic details

Source
International Journal of High Performance Computing Applications, 2020, No. 6, pp. 659-675 (17 pages)
Authors
Cristian Ramon-Cortes; Ramon Amela; Jorge Ejarque; Philippe Clauss; Rosa M. Badia
Affiliations

Barcelona Supercomputing Center (BSC);

Barcelona Supercomputing Center (BSC);

Barcelona Supercomputing Center (BSC);

INRIA - ICube Lab – 538090 Université de Strasbourg;

Barcelona Supercomputing Center (BSC);

Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
Keywords
Automatic parallelisation; distributed computing; programming models

