Annual Meeting of the Association for Computational Linguistics

Accelerating Sparse Matrix Operations in Neural Networks on Graphics Processing Units



Abstract

Graphics Processing Units (GPUs) are commonly used to train and evaluate neural networks efficiently. While previous work in deep learning has focused on accelerating operations on dense matrices/tensors on GPUs, far fewer efforts have concentrated on operations involving sparse data structures. Operations using sparse structures are common in natural language models at the input and output layers, because these models operate on sequences over discrete alphabets. We present two new GPU algorithms: one at the input layer, for multiplying a matrix by a few-hot vector (generalizing the more common operation of multiplication by a one-hot vector), and one at the output layer, for a fused softmax and top-N selection (commonly used in beam search). Our methods achieve speedups of up to 7x and 50x, respectively, over state-of-the-art parallel GPU baselines. We also illustrate how our methods scale on different GPU architectures.
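To make the input-layer operation concrete, below is a minimal CUDA sketch, not the paper's implementation; the kernel name, parameter names, and row-major layout are assumptions. A few-hot vector x has only k nonzero entries, so y = Wᵀx reduces to a scaled sum of k rows of W, and a kernel can gather exactly those rows rather than running a dense matrix-vector product over all V rows.

```cuda
// Minimal sketch (assumed names/layout, not the paper's code): W is a
// row-major V x d matrix, and the few-hot input x is stored as k
// (index, value) pairs in nz_idx/nz_val. Computes y = W^T x by gathering
// only the k selected rows; work scales with k, not with V.
__global__ void fewhot_matvec(const float *W, const int *nz_idx,
                              const float *nz_val, int k, int d, float *y) {
    int j = blockIdx.x * blockDim.x + threadIdx.x;  // one thread per output dim
    if (j >= d) return;
    float acc = 0.0f;
    for (int i = 0; i < k; ++i)          // k is small for a few-hot vector
        acc += nz_val[i] * W[(size_t)nz_idx[i] * (size_t)d + j];
    y[j] = acc;
}
```

With k = 1 and a nonzero value of 1.0 this degenerates to a one-hot lookup (an embedding-table read); the few-hot generalization keeps that cost profile for small k.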
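For the output layer, the sketch below illustrates only the fusion idea, again with assumed names and a deliberately naive selection step. Since softmax is monotonic, the top-N probabilities correspond to the top-N logits, so a single kernel can compute the normalizer once and select winners on the raw logits without materializing the full softmax; the paper's algorithm is more sophisticated than the N-pass argmax used here.

```cuda
#include <cfloat>

// Minimal sketch (assumed names, not the paper's algorithm): one block per
// row of logits (e.g., one beam hypothesis). Fuses the softmax normalizer
// with top-N selection; assumes N <= V and gridDim.x rows of length V.
#define THREADS 256
__global__ void fused_softmax_topn(const float *logits, int V, int N,
                                   int *top_idx, float *top_prob) {
    const float *row = logits + (size_t)blockIdx.x * V;
    __shared__ float s_val[THREADS];
    __shared__ int   s_idx[THREADS];

    // Row max, for a numerically stable softmax.
    float m = -FLT_MAX;
    for (int j = threadIdx.x; j < V; j += THREADS) m = fmaxf(m, row[j]);
    s_val[threadIdx.x] = m;
    __syncthreads();
    for (int s = THREADS / 2; s > 0; s >>= 1) {
        if (threadIdx.x < s)
            s_val[threadIdx.x] = fmaxf(s_val[threadIdx.x], s_val[threadIdx.x + s]);
        __syncthreads();
    }
    float rowmax = s_val[0];
    __syncthreads();

    // Normalizer Z = sum_j exp(logit_j - max), computed once for the row.
    float z = 0.0f;
    for (int j = threadIdx.x; j < V; j += THREADS) z += __expf(row[j] - rowmax);
    s_val[threadIdx.x] = z;
    __syncthreads();
    for (int s = THREADS / 2; s > 0; s >>= 1) {
        if (threadIdx.x < s) s_val[threadIdx.x] += s_val[threadIdx.x + s];
        __syncthreads();
    }
    float denom = s_val[0];
    __syncthreads();

    // Naive top-N: N argmax passes over the raw logits (softmax is
    // monotonic, so ranking logits ranks probabilities). O(N*V) rescans;
    // illustration only.
    for (int n = 0; n < N; ++n) {
        float best = -FLT_MAX;
        int besti = -1;
        for (int j = threadIdx.x; j < V; j += THREADS) {
            bool taken = false;
            for (int t = 0; t < n; ++t)
                if (top_idx[blockIdx.x * N + t] == j) taken = true;
            if (!taken && row[j] > best) { best = row[j]; besti = j; }
        }
        s_val[threadIdx.x] = best;
        s_idx[threadIdx.x] = besti;
        __syncthreads();
        for (int s = THREADS / 2; s > 0; s >>= 1) {
            if (threadIdx.x < s && s_val[threadIdx.x + s] > s_val[threadIdx.x]) {
                s_val[threadIdx.x] = s_val[threadIdx.x + s];
                s_idx[threadIdx.x] = s_idx[threadIdx.x + s];
            }
            __syncthreads();
        }
        if (threadIdx.x == 0) {
            top_idx[blockIdx.x * N + n]  = s_idx[0];
            top_prob[blockIdx.x * N + n] = __expf(s_val[0] - rowmax) / denom;
        }
        __syncthreads();  // make the new winner visible before the next pass
    }
}
```

A launch such as fused_softmax_topn<<<beam_size, THREADS>>>(logits, V, N, idx, prob) would process one hypothesis per block; fusing the two steps avoids writing and re-reading the full V-length probability vector, the kind of memory traffic a fused kernel is meant to eliminate.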

