IEEE International Symposium on Circuits and Systems

Accelerating Deep Neural Network Computation on a Low Power Reconfigurable Architecture

Abstract

Recent work on neural network architectures has focused on bridging the gap between performance/efficiency and programmability. We consider implementations of three popular neural networks, ResNet, AlexNet, and the ASGD weight-dropped recurrent neural network (AWD RNN), on a low-power programmable architecture, Transformer. The architecture consists of lightweight cores interconnected by caches and crossbars that support run-time reconfiguration between shared and private cache modes. We present efficient implementations of key neural network kernels and evaluate the performance of each kernel under the different cache modes. The best-performing cache mode per kernel is then used in the implementation of the end-to-end network. Simulation results show superior performance, with ResNet, AlexNet, and AWD RNN achieving 188.19 GOPS/W, 150.53 GOPS/W, and 120.68 GOPS/W, respectively, at the 14 nm technology node.
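The selection flow the abstract describes, profiling each kernel in both cache modes and composing the end-to-end network from the per-kernel winners, can be sketched in a few lines. The Python below is a minimal illustration under assumed inputs: the kernel names, the GOPS/W figures, and the profiling table are hypothetical stand-ins, not data or interfaces from the paper.

```python
# Hypothetical sketch: pick the best cache mode per kernel, then compose the
# end-to-end network from those choices. Kernel names, the two cache modes,
# and the throughput numbers below are illustrative assumptions, not
# measurements from the paper.

from typing import Dict, List, Tuple

CACHE_MODES = ("shared", "private")

# Stand-in for per-kernel profiling results (GOPS/W) that a real flow would
# obtain by simulating each kernel on the architecture in each cache mode.
ASSUMED_PROFILE: Dict[str, Dict[str, float]] = {
    "conv3x3":   {"shared": 140.0, "private": 165.0},
    "fc":        {"shared": 120.0, "private": 105.0},
    "lstm_cell": {"shared":  95.0, "private": 110.0},
}

def best_cache_mode(kernel: str) -> str:
    """Return the cache mode with the highest profiled efficiency."""
    modes = ASSUMED_PROFILE[kernel]
    return max(modes, key=modes.get)

def plan_network(kernels: List[str]) -> List[Tuple[str, str]]:
    """Assign every kernel in an end-to-end network its best cache mode."""
    return [(k, best_cache_mode(k)) for k in kernels]

if __name__ == "__main__":
    # A ResNet-like layer sequence, purely for illustration.
    for kernel, mode in plan_network(["conv3x3", "conv3x3", "fc"]):
        print(f"{kernel}: run in {mode} cache mode")
```

In an actual evaluation the profile table would be filled by running each kernel on the simulated architecture in each cache mode, as the paper's per-kernel measurements do, rather than from fixed constants.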
