Mozart: Efficient Composition of Library Functions for Heterogeneous Execution

Abstract

The current processor trend is to couple a commodity processor with a GPU, a co-processor, or an accelerator. Unleashing the full computational power of such heterogeneous systems is a daunting task: programmers often resort to heterogeneous scheduling runtime frameworks that use device-specific library routines. However, highly tuned libraries do not compose well across heterogeneous architectures. That is, important performance-oriented optimizations such as data locality and reuse across library calls are not fully exploited. In this paper, we present a framework, called Mozart, that extends existing library frameworks to efficiently compose a sequence of library calls for heterogeneous execution. Mozart consists of two components: a library description (LD) and a library composition runtime. We encourage library writers to wrap existing libraries using LD in order to provide their performance parameters on heterogeneous cores; no programmer intervention is necessary. Our runtime composes libraries via task fission, balances load among heterogeneous cores using information from LD, and automatically adapts to the runtime behavior of an application. We evaluate Mozart on a Xeon + 2 Xeon Phi system using the High Performance Linpack benchmark, the most popular benchmark for ranking supercomputers in the TOP500, and show GFLOPS improvements of 31.7% over MKL with Automatic Offload and 6.7% over hand-optimized ninja code.
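
The abstract names task fission and LD-driven load balancing but does not give their details. The following is only a minimal illustrative sketch of the general idea, not Mozart's actual API or algorithm: `DeviceProfile`, `split_rows`, and `assign_chunks` are hypothetical names, the throughput figures are made up, and the greedy earliest-finish policy is an assumption chosen for simplicity.

```python
from dataclasses import dataclass


@dataclass
class DeviceProfile:
    """Hypothetical stand-in for one library-description (LD) entry."""
    name: str
    gflops: float  # assumed throughput of the library routine on this device


def split_rows(total_rows, chunk_rows):
    """Task fission: cut one large library call into (start, stop) row ranges."""
    return [(start, min(start + chunk_rows, total_rows))
            for start in range(0, total_rows, chunk_rows)]


def assign_chunks(chunks, devices):
    """Greedy load balancing: each chunk goes to the device that would finish it earliest."""
    finish_time = {d.name: 0.0 for d in devices}
    schedule = {d.name: [] for d in devices}
    for start, stop in chunks:
        work = stop - start  # rows used as a crude proxy for work
        best = min(devices, key=lambda d: finish_time[d.name] + work / d.gflops)
        finish_time[best.name] += work / best.gflops
        schedule[best.name].append((start, stop))
    return schedule


if __name__ == "__main__":
    devices = [DeviceProfile("cpu", 1.0), DeviceProfile("phi", 3.0)]
    print(assign_chunks(split_rows(10, 4), devices))
```

In this sketch the faster device receives more chunks because its estimated finish time grows more slowly; a real runtime would also have to account for data transfer cost and measured (rather than declared) performance.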

Bibliographic details

Source: International Workshop on Languages and Compilers for Parallel Computing | 2019 | 290p | 21 pages in total
Authors: Rajkishore Barik; Tatiana Shpeisman; Hongbo Rong; Chunling Hu; Victor W. Lee; Todd A. Anderson; Greg Henry; Hal Liu; Youfeng Wu; Paul Petersen; Geoff Lowney
Original format: PDF
Chinese Library Classification: TP314-53
Date indexed: 2022-08-20 20:19:59

Similar literature

1. Using Reduced Execution Flow Graph to Identify Library Functions in Binary Code [J]. Qiu Jing, Su Xiaohong, Ma Peijun. Software Engineering, IEEE Transactions on. 2016, No. 2
2. Predictive Thermal Management for Energy-Efficient Execution of Concurrent Applications on Heterogeneous Multicores [J]. Wachter Eduardo Weber, de Bellefroid Cedric, Basireddy Karunakar Reddy. IEEE Transactions on Very Large Scale Integration (VLSI) Systems. 2019, No. 6
3. ECOS: An efficient task-clustering based cost-effective aware scheduling algorithm for scientific workflows execution on heterogeneous cloud systems [J]. Dong Minggang, Fan Lili, Jing Chao. The Journal of Systems and Software. 2019, No. Deca
4. Mozart: Efficient Composition of Library Functions for Heterogeneous Execution [C]. Rajkishore Barik, Tatiana Shpeisman, Hongbo Rong. International Workshop on Languages and Compilers for Parallel Computing. 2019
5. Efficient Fine-Grain Cooperative Execution of Dynamic Task Parallelism on Heterogeneous Multi/Manycore Systems [D]. Wang, Moyang. 2021
6. Brain Networks Underlying Strategy Execution and Feedback Processing in an Efficient Functional Magnetic Resonance Imaging Neurofeedback Training Performed in a Parallel or a Serial Paradigm [O]. Wan Ilma Dewiputri, Renate Schweizer, Tibor Auer. 2021
7. Code generation for energy‐efficient execution of dynamic streaming task graphs on parallel and heterogeneous platforms [O]. Sebastian Litzinger, Jörg Keller. 2020
