COMPUTE OPTIMIZATIONS FOR LOW PRECISION MACHINE LEARNING OPERATIONS

Abstract

The present disclosure provides an interconnect fabric comprising one or more switches, a memory interface coupled to the interconnect fabric, an input/output (IO) interface coupled to the interconnect fabric, and an array of processing clusters coupled to the interconnect fabric. The array of processing clusters is to process mixed-precision instructions. At least one processing cluster comprises a plurality of registers to store a plurality of packed data elements at a first precision and an execution unit to execute mixed-precision dot-product instructions. The execution unit is to perform a plurality of multiplications of different pairs of the plurality of packed data elements to generate a corresponding plurality of products and to add the corresponding plurality of products to an accumulation value stored at a second precision greater than the first precision.
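
The mixed-precision dot-product described in the abstract can be illustrated with a small software model. The sketch below is not taken from the patent. It assumes, purely for illustration, that the first precision is signed 8-bit integers packed four to a 32-bit word and that the second precision is a signed 32-bit accumulator. The function names `pack_int8x4`, `unpack_int8x4`, and `dot4_accumulate` are hypothetical.

```python
import struct


def pack_int8x4(values):
    """Pack four signed 8-bit integers into one 32-bit word.

    Assumption: little-endian packing, chosen only for this illustration.
    """
    if len(values) != 4:
        raise ValueError("expected exactly four elements")
    return struct.unpack("<I", struct.pack("<4b", *values))[0]


def unpack_int8x4(word):
    """Recover the four signed 8-bit elements from a packed 32-bit word."""
    return list(struct.unpack("<4b", struct.pack("<I", word)))


def dot4_accumulate(a_word, b_word, acc):
    """Multiply element pairs from two packed words and add the products
    to an accumulator held at a wider (here 32-bit signed) precision."""
    products = [x * y for x, y in zip(unpack_int8x4(a_word), unpack_int8x4(b_word))]
    total = acc + sum(products)
    # Wrap to the signed 32-bit range to model a fixed-width accumulator.
    return (total + 2**31) % 2**32 - 2**31


a = pack_int8x4([1, -2, 3, 4])
b = pack_int8x4([5, 6, -7, 8])
result = dot4_accumulate(a, b, acc=100)  # 5 - 12 - 21 + 32 + 100

# Worst-case magnitudes show why the accumulator needs more bits than the inputs.
big = pack_int8x4([127, 127, 127, 127])
wide = dot4_accumulate(big, big, acc=0)
```

In this model a single product of two 8-bit inputs can reach 16129, which already exceeds the 8-bit range, so the sum is only representable because the accumulation value is kept at the greater precision.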