Journal: Parallel Computing

A Class Of Parallel Tiled Linear Algebra Algorithms For Multicore Architectures


Abstract

As multicore systems continue to gain ground in the high-performance computing world, linear algebra algorithms have to be reformulated, or new algorithms developed, in order to take advantage of the architectural features of these new processors. Fine-grained parallelism becomes a major requirement and introduces the necessity of loose synchronization in the parallel execution of an operation. This paper presents algorithms for the Cholesky, LU and QR factorizations in which the operations can be represented as a sequence of small tasks that operate on square blocks of data. These tasks can be dynamically scheduled for execution based on the dependencies among them and on the availability of computational resources. This may result in out-of-order execution of tasks, which will completely hide the presence of intrinsically sequential tasks in the factorization. Performance comparisons are presented against LAPACK algorithms, where parallelism can only be exploited at the level of the BLAS operations, and against vendor implementations.
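
The task view described in the abstract can be illustrated with a small sketch. It is not the paper's implementation: it only enumerates the tile-level tasks of a standard right-looking tiled Cholesky factorization (named after the LAPACK/BLAS kernels POTRF, TRSM, SYRK and GEMM), derives dependencies from which tiles each task reads and writes, and runs a toy greedy scheduler. The function names `cholesky_tasks`, `build_dependencies` and `schedule`, the unit-time task model and the fixed worker count are all assumptions made for illustration.

```python
from collections import defaultdict


def cholesky_tasks(nt):
    """List (name, tiles_read, tiles_updated) for an nt x nt tile grid.

    Tasks appear in the order a sequential right-looking tiled Cholesky
    would issue them. "Updated" tiles are modified in place.
    """
    tasks = []
    for k in range(nt):
        tasks.append((f"POTRF({k})", [], [(k, k)]))
        for i in range(k + 1, nt):
            tasks.append((f"TRSM({i},{k})", [(k, k)], [(i, k)]))
        for i in range(k + 1, nt):
            tasks.append((f"SYRK({i},{k})", [(i, k)], [(i, i)]))
            for j in range(k + 1, i):
                tasks.append(
                    (f"GEMM({i},{j},{k})", [(i, k), (j, k)], [(i, j)])
                )
    return tasks


def build_dependencies(tasks):
    """For each task, return the set of earlier task indices it must wait for."""
    last_writer = {}              # tile -> index of the last task that updated it
    readers = defaultdict(list)   # tile -> readers since that last update
    deps = []
    for t, (_, reads, writes) in enumerate(tasks):
        d = set()
        for tile in reads:
            if tile in last_writer:
                d.add(last_writer[tile])      # read-after-write
            readers[tile].append(t)
        for tile in writes:
            if tile in last_writer:
                d.add(last_writer[tile])      # in-place update follows last update
            d.update(readers[tile])           # write-after-read
            readers[tile] = []
            last_writer[tile] = t
        d.discard(t)
        deps.append(d)
    return deps


def schedule(tasks, deps, workers):
    """Toy scheduler: every task takes one time step, at most `workers` per step."""
    done = set()
    pending = list(range(len(tasks)))
    steps = []
    while pending:
        ready = [t for t in pending if deps[t] <= done]
        batch = ready[:workers]
        steps.append([tasks[t][0] for t in batch])
        done.update(batch)
        pending = [t for t in pending if t not in done]
    return steps


if __name__ == "__main__":
    tasks = cholesky_tasks(4)
    deps = build_dependencies(tasks)
    for step, names in enumerate(schedule(tasks, deps, workers=3)):
        print(step, names)
```

In this sketch a task becomes eligible as soon as the tiles it needs are ready, regardless of which panel step `k` issued it, so work from different steps can share a time step. A real runtime would also weigh task costs and priorities, which this toy model ignores.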
