IEEE Symposium on Computer Arithmetic

Intel Nervana Neural Network Processor-T (NNP-T) Fused Floating Point Many-Term Dot Product



Abstract

Intel’s Nervana Neural Network Processor for Training (NNP-T) contains at its core an advanced floating point dot product design to accelerate the matrix multiplication operations found in many AI applications. Each Matrix Processing Unit (MPU) on the Intel NNP-T can process a 32x32 BFloat16 matrix multiplication every 32 cycles, accumulating the result in single precision (FP32). To reduce hardware costs, the MPU uses a fused many-term floating point dot product design with block alignment of the many input terms during addition, resulting in a unique datapath with several interesting design trade-offs. In this paper, we describe the details of the MPU pipeline, discuss the trade-offs made in the design, and present information on the accuracy of the computation as compared to traditional FMA implementations.
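The block-alignment scheme the abstract describes can be sketched in software: every product in the dot product is shifted to the block's maximum exponent before a single wide integer sum, so terms far below the maximum lose low-order bits rather than being rounded term-by-term as in a chain of FMAs. The sketch below is a simplified behavioral model, not the NNP-T datapath; the function name, the 24-bit alignment width, and the use of Python floats in place of BFloat16 inputs are all illustrative assumptions.

```python
import math
import random

def block_aligned_dot(a, b, frac_bits=24):
    """Hedged model of a fused many-term dot product with block alignment:
    all products are aligned to the block's maximum exponent and summed as
    fixed-point integers, so small terms are truncated against the largest
    term instead of being rounded individually (illustrative width only)."""
    products = [x * y for x, y in zip(a, b)]
    nonzero = [p for p in products if p != 0.0]
    if not nonzero:
        return 0.0
    # math.frexp(p) -> (m, e) with 0.5 <= |m| < 1 and p == m * 2**e
    emax = max(math.frexp(p)[1] for p in nonzero)
    total = 0
    for p in nonzero:
        m, e = math.frexp(p)
        # scale the mantissa to a fixed-point integer, then shift right
        # to align it with the block's maximum exponent
        fixed = int(m * (1 << frac_bits))
        total += fixed >> (emax - e)
    return total / (1 << frac_bits) * 2.0 ** emax

random.seed(0)
a = [random.uniform(-1, 1) for _ in range(32)]
b = [random.uniform(-1, 1) for _ in range(32)]

fused = block_aligned_dot(a, b)
exact = math.fsum(x * y for x, y in zip(a, b))
print(fused, exact, abs(fused - exact))
```

With 32 terms of comparable magnitude the aligned sum tracks the exact result closely; the interesting accuracy trade-off the paper analyzes appears when term magnitudes vary widely, since bits shifted out during alignment are lost before the final FP32 rounding.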
