Journal: ACM Transactions on Mathematical Software

A Set of Batched Basic Linear Algebra Subprograms and LAPACK Routines

Abstract

This article describes a standard API for a set of Batched Basic Linear Algebra Subprograms (Batched BLAS or BBLAS). The focus is on many independent BLAS operations on small matrices that are grouped together and processed by a single routine, called a Batched BLAS routine. The matrices are grouped together in uniformly sized groups, with just one group if all the matrices are of equal size. The aim is to provide more efficient, but portable, implementations of algorithms on high-performance many-core platforms. These include multicore and many-core CPU processors, GPUs and coprocessors, and other hardware accelerators with floating-point compute facility. As well as the standard types of single and double precision, we also include half and quadruple precision in the standard. In particular, half precision is used in many very large scale applications, such as those associated with machine learning.
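
The grouping idea in the abstract can be illustrated with a small pure-Python sketch. It is not the API defined in the article: the names `GemmGroup`, `gemm`, and `batched_gemm` are hypothetical, the matrices are plain nested lists, and the loops run sequentially, whereas a real implementation would be free to run the independent problems in parallel. The sketch only shows how many small, independent matrix products can be packed into uniformly sized groups and handed to a single call.

```python
from dataclasses import dataclass
from typing import List

Matrix = List[List[float]]  # row-major nested lists (an assumption of this sketch)


@dataclass
class GemmGroup:
    """Hypothetical container: all problems in a group share sizes and scalars."""
    m: int
    n: int
    k: int
    alpha: float
    beta: float
    a: List[Matrix]  # each m x k
    b: List[Matrix]  # each k x n
    c: List[Matrix]  # each m x n, updated in place


def gemm(m, n, k, alpha, a, b, beta, c):
    """Reference C := alpha * A * B + beta * C for one small problem."""
    for i in range(m):
        for j in range(n):
            acc = 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]
            c[i][j] = alpha * acc + beta * c[i][j]


def batched_gemm(groups: List[GemmGroup]) -> None:
    """Single entry point that processes every problem in every group."""
    for g in groups:
        if not (len(g.a) == len(g.b) == len(g.c)):
            raise ValueError("each group needs equally many A, B and C matrices")
        # The problems are independent, so this loop could be parallelized.
        for a, b, c in zip(g.a, g.b, g.c):
            gemm(g.m, g.n, g.k, g.alpha, a, b, g.beta, c)


# Two 2x2 products form one group; a single 1x3 by 3x1 product forms another.
group_2x2 = GemmGroup(
    m=2, n=2, k=2, alpha=1.0, beta=0.0,
    a=[[[1.0, 2.0], [3.0, 4.0]], [[1.0, 0.0], [0.0, 1.0]]],
    b=[[[1.0, 0.0], [0.0, 1.0]], [[5.0, 6.0], [7.0, 8.0]]],
    c=[[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]],
)
group_dot = GemmGroup(
    m=1, n=1, k=3, alpha=2.0, beta=1.0,
    a=[[[1.0, 2.0, 3.0]]],
    b=[[[1.0], [1.0], [1.0]]],
    c=[[[10.0]]],
)
batched_gemm([group_2x2, group_dot])
```

If every matrix had the same size, the list passed to `batched_gemm` would contain a single group, which matches the one-group case mentioned in the abstract.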