IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

XNOR Neural Engine: A Hardware Accelerator IP for 21.6-fJ/op Binary Neural Network Inference

Abstract

Binary neural networks (BNNs) promise to deliver accuracy comparable to that of conventional deep neural networks at a fraction of the memory and energy cost. In this paper, we introduce the XNOR neural engine (XNE), a fully digital, configurable hardware accelerator IP for BNNs, integrated within a microcontroller unit (MCU) equipped with an autonomous I/O subsystem and a hybrid SRAM/standard-cell memory. The XNE can compute convolutional and dense layers either fully autonomously or in cooperation with the MCU core to realize more complex behaviors. We present post-synthesis results in 65- and 22-nm technology for the XNE IP, and post-layout results in 22 nm for the full MCU, indicating that this system can drop the energy cost per binary operation to 21.6 fJ at 0.4 V while remaining flexible and performant enough to execute state-of-the-art BNN topologies such as ResNet-34 in less than 2.2 mJ per frame at 8.9 frames/s.
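The efficiency of an XNOR engine rests on the standard BNN reduction: with activations and weights constrained to {-1, +1}, each multiplication becomes a bitwise XNOR and each accumulation a population count over packed words. The following C sketch illustrates that reduction for one 32-bit word; bdot32 is a hypothetical helper shown for illustration only (using the GCC/Clang __builtin_popcount intrinsic), not the XNE datapath itself.

#include <stdint.h>

/* Binary dot product of 32 {-1,+1} values, each packed as one bit
 * (bit 1 encodes +1, bit 0 encodes -1). A product is +1 exactly when
 * the two bits agree, which is XNOR; the accumulated sum is then
 * (+1)*matches + (-1)*(32 - matches) = 2*matches - 32.
 * Hypothetical illustration, not the paper's implementation. */
static inline int32_t bdot32(uint32_t activations, uint32_t weights)
{
    uint32_t agree = ~(activations ^ weights);   /* XNOR: 1 where signs match */
    int32_t matches = __builtin_popcount(agree); /* count of +1 products */
    return 2 * matches - 32;                     /* matches minus mismatches */
}

A dense layer, or a convolution lowered to matrix form, then reduces to a loop of such XNOR-popcount steps over packed words, which is what lets a fixed-function binary datapath reach per-operation energies far below those of full-precision multiply-accumulate.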
