International Conference on Omni-layer Intelligent Systems

A Microcontroller is All You Need: Enabling Transformer Execution on Low-Power IoT Endnodes

Abstract

Transformer networks have become state-of-the-art for many tasks such as NLP and are closing the gap on other tasks like image recognition. Similarly, Transformers and Attention methods are starting to attract attention on smaller-scale tasks, which fit the typical memory envelope of MCUs. In this work, we propose a new set of execution kernels tuned for efficient execution on MCU-class RISC-V and ARM Cortex-M cores. We focus on minimizing memory movements while maximizing data reuse in the Attention layers. With our library, we obtain 3.4×, 1.8×, and 2.1× lower latency and energy on 8-bit Attention layers, compared to previous state-of-the-art (SoA) linear and matrix multiplication kernels in the CMSIS-NN and PULP-NN libraries on the STM32H7 (Cortex M7), STM32L4 (Cortex M4), and GAP8 (RISC-V IMC-Xpulp) platforms, respectively. As a use case for our TinyTransformer library, we also demonstrate that we can fit a 263 kB Transformer on the GAP8 platform, outperforming the previous SoA convolutional architecture on the TinyRadarNN dataset, with a latency of 9.24 ms and 0.47 mJ energy consumption and an accuracy improvement of 3.5%.
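To make the data-reuse idea concrete, below is a minimal illustrative sketch in C of an 8-bit attention-score kernel (S = Q·K^T with requantization back to int8). It is not the paper's TinyTransformer library: the sequence length, head dimension, function name attention_scores_q8, and the fixed-point requantization parameters are all hypothetical. The point it shows is that each row of Q is loaded once and reused against every row of K, the kind of memory-movement reduction the paper's kernels optimize far more aggressively on the target cores.

#include <stdint.h>
#include <stdio.h>

#define SEQ_LEN 16   /* hypothetical sequence length */
#define HEAD_DIM 8   /* hypothetical per-head dimension */

/* Simplified 8-bit attention-score kernel: S = Q * K^T, requantized
 * to int8. Each row of Q is fetched once and reused across all rows
 * of K; real MCU kernels add tiling and SIMD on top of this idea. */
static void attention_scores_q8(const int8_t *q, const int8_t *k,
                                int8_t *s, int32_t mult, int shift)
{
    for (int i = 0; i < SEQ_LEN; i++) {
        const int8_t *q_row = &q[i * HEAD_DIM]; /* reused for all j */
        for (int j = 0; j < SEQ_LEN; j++) {
            const int8_t *k_row = &k[j * HEAD_DIM];
            int32_t acc = 0;
            for (int d = 0; d < HEAD_DIM; d++)
                acc += (int32_t)q_row[d] * (int32_t)k_row[d];
            /* Fixed-point requantization to int8 (simplified). */
            int32_t out = (int32_t)(((int64_t)acc * mult) >> shift);
            if (out > 127) out = 127;
            if (out < -128) out = -128;
            s[i * SEQ_LEN + j] = (int8_t)out;
        }
    }
}

int main(void)
{
    int8_t q[SEQ_LEN * HEAD_DIM], k[SEQ_LEN * HEAD_DIM];
    int8_t s[SEQ_LEN * SEQ_LEN];
    for (int i = 0; i < SEQ_LEN * HEAD_DIM; i++) {
        q[i] = (int8_t)(i % 7 - 3); /* toy test data */
        k[i] = (int8_t)(i % 5 - 2);
    }
    attention_scores_q8(q, k, s, 1 << 14, 16); /* arbitrary requant params */
    printf("s[0][0] = %d\n", s[0][0]);
    return 0;
}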