Annual Meeting of the Association for Computational Linguistics

Accelerating Sparse Matrix Operations in Neural Networks on Graphics Processing Units

Abstract

Graphics Processing Units (GPUs) are commonly used to train and evaluate neural networks efficiently. While previous work in deep learning has focused on accelerating operations on dense matrices/tensors on GPUs, comparatively little effort has been devoted to operations involving sparse data structures. Operations using sparse structures are common in natural language models at the input and output layers, because these models operate on sequences over discrete alphabets. We present two new GPU algorithms: one at the input layer, for multiplying a matrix by a few-hot vector (generalizing the more common operation of multiplication by a one-hot vector), and one at the output layer, for a fused softmax and top-N selection (commonly used in beam search). Our methods achieve speedups of up to 7x and 50x, respectively, over state-of-the-art parallel GPU baselines. We also illustrate how our methods scale on different GPU architectures.
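To make the input-layer operation concrete: multiplying a weight matrix by a few-hot vector reduces to a weighted sum of the few columns selected by the vector's nonzero entries, so only k·d matrix elements are ever read. The following is a minimal illustrative CUDA sketch, not the authors' code; the kernel name `few_hot_matvec`, the row-major d×V layout, and the (index, value) sparse representation are all assumptions made for the example.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical sketch of the input-layer operation: multiply a weight
// matrix W (d rows, V columns, row-major) by a "few-hot" vector given in
// sparse form as k (index, value) pairs. The product is a weighted sum
// of k columns of W, so no dense multiply is needed.
__global__ void few_hot_matvec(const float* W, int d, int V,
                               const int* idx, const float* val, int k,
                               float* out) {
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= d) return;
    float acc = 0.0f;
    for (int i = 0; i < k; ++i)   // k is small (few-hot), so this loop is cheap
        acc += val[i] * W[row * V + idx[i]];
    out[row] = acc;
}

int main() {
    const int d = 4, V = 8, k = 2;
    float hW[d * V];
    for (int i = 0; i < d * V; ++i) hW[i] = (float)i;
    int   hIdx[k] = {1, 6};       // nonzero positions of the few-hot vector
    float hVal[k] = {1.0f, 0.5f}; // their values

    float *W, *val, *out; int *idx;
    cudaMalloc(&W, sizeof(hW));     cudaMemcpy(W, hW, sizeof(hW), cudaMemcpyHostToDevice);
    cudaMalloc(&idx, sizeof(hIdx)); cudaMemcpy(idx, hIdx, sizeof(hIdx), cudaMemcpyHostToDevice);
    cudaMalloc(&val, sizeof(hVal)); cudaMemcpy(val, hVal, sizeof(hVal), cudaMemcpyHostToDevice);
    cudaMalloc(&out, d * sizeof(float));

    few_hot_matvec<<<(d + 255) / 256, 256>>>(W, d, V, idx, val, k, out);

    float hOut[d];
    cudaMemcpy(hOut, out, sizeof(hOut), cudaMemcpyDeviceToHost);
    for (int i = 0; i < d; ++i) printf("out[%d] = %g\n", i, hOut[i]);
    cudaFree(W); cudaFree(idx); cudaFree(val); cudaFree(out);
    return 0;
}
```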
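For the output layer, the fusion idea can also be sketched. Because softmax is monotonic in the logits, the top-N probabilities belong to the top-N logits, so the max, the normalizer, and the selection can share a single pass over the vocabulary-sized logit vector instead of separate kernel launches. The single-block sketch below is hypothetical and far simpler than the paper's parallel selection; it assumes blockDim.x is a power of two, V >= N, and N small (a beam size).

```cuda
// Hypothetical sketch: fused softmax + top-N in one kernel (single block).
// Shared-memory reductions compute the max and the normalizer; thread 0
// then does a simple serial top-N scan. Launch, e.g.:
//   fused_softmax_topn<<<1, 256, 256 * sizeof(float)>>>(d_logits, V, N, d_idx, d_prob);
__global__ void fused_softmax_topn(const float* logits, int V, int N,
                                   int* top_idx, float* top_prob) {
    extern __shared__ float sh[];
    int tid = threadIdx.x;

    // 1) max reduction, for numerical stability of exp()
    float m = -INFINITY;
    for (int i = tid; i < V; i += blockDim.x) m = fmaxf(m, logits[i]);
    sh[tid] = m;
    __syncthreads();
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (tid < s) sh[tid] = fmaxf(sh[tid], sh[tid + s]);
        __syncthreads();
    }
    float gmax = sh[0];
    __syncthreads();

    // 2) normalizer Z = sum of exp(logit - max), same shared buffer
    float sum = 0.0f;
    for (int i = tid; i < V; i += blockDim.x) sum += expf(logits[i] - gmax);
    sh[tid] = sum;
    __syncthreads();
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (tid < s) sh[tid] += sh[tid + s];
        __syncthreads();
    }
    float Z = sh[0];

    // 3) softmax is monotonic, so top-N probabilities = top-N logits;
    //    a serial insertion scan by thread 0 suffices for small N.
    if (tid == 0) {
        for (int n = 0; n < N; ++n) { top_idx[n] = -1; top_prob[n] = -INFINITY; }
        for (int i = 0; i < V; ++i) {
            float x = logits[i];
            int pos = N;
            while (pos > 0 && x > top_prob[pos - 1]) pos--;
            if (pos < N) {                      // insert into descending list
                for (int j = N - 1; j > pos; --j) {
                    top_prob[j] = top_prob[j - 1];
                    top_idx[j]  = top_idx[j - 1];
                }
                top_prob[pos] = x;
                top_idx[pos]  = i;
            }
        }
        for (int n = 0; n < N; ++n)             // convert stored logits to probs
            top_prob[n] = expf(top_prob[n] - gmax) / Z;
    }
}
```

The design point the sketch tries to show is the one named in the abstract: fusing the two steps means the large logit vector is traversed once per reduction inside one launch, rather than materializing a full softmax output and then running a separate top-N kernel over it.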
