Journal: IEEE Journal of Solid-State Circuits

KiloCore: A 32-nm 1000-Processor Computational Array


Abstract

A processor array containing 1000 independent processors and 12 memory modules was fabricated in 32-nm partially depleted silicon-on-insulator CMOS. The programmable processors occupy 0.055 mm² each, contain no algorithm-specific hardware, and operate up to an average maximum clock frequency of 1.78 GHz at 1.1 V. At 0.9 V, processors operating at an average of 1.24 GHz dissipate 17 mW while issuing one instruction per cycle. At 0.56 V, processors operating at an average of 115 MHz dissipate 0.61 mW while issuing one instruction per cycle, resulting in an energy consumption of 5.3 pJ/instruction. On-die communication is performed by complementary circuit and packet-based networks that yield a total array bisection bandwidth of 4.2 Tb/s. Independent memory modules handle data and instructions and operate up to an average maximum clock frequency of 1.77 GHz at 1.1 V. All processors, their packet routers, and the memory modules contain unconstrained clock oscillators within independent clock domains that adapt to large supply voltage noise. Compared with a variety of Intel i7s and Nvidia GPUs, the KiloCore at 1.1 V has geometric mean improvements of 4.3× higher throughput per area and 9.4× higher energy efficiency for AES encryption, 4095-b low-density parity-check decoding, 4096-point complex fast Fourier transform, and 100-B record sorting applications.
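The energy-per-instruction figure quoted in the abstract follows from dividing per-processor power by instruction rate. The sketch below reproduces that arithmetic using only the numbers given in the abstract. The function name `energy_per_instruction_pj` and its parameters are illustrative, not taken from the paper, and the 0.9 V result is a value derived here, not one the abstract reports.

```python
def energy_per_instruction_pj(power_mw: float, freq_mhz: float, ipc: float = 1.0) -> float:
    """Return energy per instruction in picojoules.

    power_mw: average power of one processor in milliwatts
    freq_mhz: average clock frequency in megahertz
    ipc:      instructions issued per cycle (the abstract states one per cycle)
    """
    instructions_per_second = freq_mhz * 1e6 * ipc
    joules_per_instruction = (power_mw * 1e-3) / instructions_per_second
    return joules_per_instruction * 1e12  # joules -> picojoules


# Operating points taken from the abstract.
low_v = energy_per_instruction_pj(power_mw=0.61, freq_mhz=115.0)   # 0.56 V
mid_v = energy_per_instruction_pj(power_mw=17.0, freq_mhz=1240.0)  # 0.9 V (derived here)

print(f"0.56 V: {low_v:.1f} pJ/instruction")
print(f"0.9 V:  {mid_v:.1f} pJ/instruction")
```

Under these assumptions the 0.56 V point comes out at roughly 5.3 pJ/instruction, matching the abstract, while the 0.9 V point works out to about 13.7 pJ/instruction, showing how the lower supply voltage trades clock frequency for energy per operation.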
