Journal: Microprocessor Report

TESLA D1 TACKLES AI TRAINING: Custom Design Combines 25 Chips Using Fanout Packaging

Abstract

Elon Musk likes unorthodox approaches. For his latest surprise, Tesla has developed its own AI training chip, aiming to replace Nvidia GPUs in its massive data center. Its new "Dojo" system will train complex neural networks for its autonomous cars. The company's custom D1 accelerator yields industry-leading performance and I/O bandwidth. It implements a unique method to combine 25 chips in one package, delivering more than 9Pflop/s (9,000Tflop/s) of peak BF16 throughput. Each D1 chip has 354 custom CPU cores and employs hundreds of serdes lanes to achieve its chip-to-chip bandwidth. To train its vast neural networks, Tesla must connect thousands of these chips. It packages them using TSMC's latest advance in integrated-fanout (InFO) technology, which employs a wafer-size substrate to support and connect more than two dozen chips, increasing compute density and simplifying deployment. TSMC manufactures the design in its 7nm technology and also handles the packaging; it has already delivered engineering samples, which run at up to 2.0GHz. The carmaker has designated the product for internal use only.
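
The headline figures in the abstract can be broken down with simple arithmetic. The sketch below is a rough back-of-the-envelope estimate, not a published specification: it treats "more than 9Pflop/s" as a 9,000Tflop/s lower bound, and it assumes that the throughput divides evenly across the 25 chips and 354 cores and that the peak figure corresponds to the 2.0GHz maximum sample clock. The function names are illustrative only.

```python
# Back-of-the-envelope breakdown of the figures quoted in the abstract.
# Assumptions (not stated in the source): throughput divides evenly across
# chips and cores, and the peak figure is reached at the 2.0GHz sample clock.

PACKAGE_PEAK_TFLOPS = 9_000.0   # "more than 9Pflop/s" taken as a lower bound
CHIPS_PER_PACKAGE = 25
CORES_PER_CHIP = 354
CLOCK_GHZ = 2.0                 # "up to 2.0GHz" for engineering samples


def per_chip_tflops() -> float:
    """Peak BF16 Tflop/s per D1 chip under the even-split assumption."""
    return PACKAGE_PEAK_TFLOPS / CHIPS_PER_PACKAGE


def per_core_tflops() -> float:
    """Peak BF16 Tflop/s per core under the even-split assumption."""
    return per_chip_tflops() / CORES_PER_CHIP


def flops_per_core_per_cycle() -> float:
    """BF16 operations each core would need to retire every clock cycle."""
    # Tflop/s -> flop/s is 1e12, GHz -> Hz is 1e9, so the ratio scales by 1e3.
    return per_core_tflops() * 1e3 / CLOCK_GHZ


if __name__ == "__main__":
    print(f"per chip : {per_chip_tflops():.1f} Tflop/s")
    print(f"per core : {per_core_tflops():.3f} Tflop/s")
    print(f"per cycle: {flops_per_core_per_cycle():.0f} flop/core/cycle")
```

Under these assumptions each chip contributes at least 360Tflop/s, or roughly 1Tflop/s per core. Because the source gives the package total only as a lower bound, these are floors rather than exact values.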
