International Conference on Internet-of-Things Design and Implementation

Fast and Accurate Streaming CNN Inference via Communication Compression on the Edge


Abstract

Recently, compact CNN models have been developed to enable computer vision on the edge. While the small model size reduces the storage overhead and the lightweight layer operations ease the burden on edge processors, it is still challenging to sustain high inference performance because inter-device bandwidth is limited and varies over time. We propose a streaming inference framework that simultaneously improves throughput and accuracy through communication compression. Specifically, we perform the following optimizations: 1) Partition: we split the CNN layers so that the devices achieve a balanced computation load; 2) Compression: we identify inter-device communication bottlenecks and insert Auto-Encoders (AEs) into the original CNN to compress data traffic; 3) Scheduling: we adaptively select the compression ratio when the variation in bandwidth is large. These optimizations significantly improve inference throughput because of better communication performance. More importantly, accuracy also increases, since (a) fewer frames are dropped when input images are streamed in at a high rate, and (b) the frames that successfully enter the pipeline are processed accurately, because the AE-based compression incurs negligible information loss. We evaluate MobileNet-v2 on a pipeline of Raspberry Pi 3B+ devices. Our compression techniques yield up to a 32% accuracy improvement when the average Wi-Fi bandwidth varies from 3 to 9 Mbps.
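The partition and scheduling steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give their algorithms, so the code assumes a simple dynamic program that splits contiguous layers to minimize the slowest stage, and a simple rule that picks the smallest compression ratio whose transfer time fits a per-frame time budget. The function names, the candidate ratios, and all parameters are hypothetical, and the Auto-Encoder itself is not modeled.

```python
from typing import List, Sequence


def partition_layers(costs: Sequence[float], num_devices: int) -> List[int]:
    """Split contiguous layers into num_devices stages so the slowest stage
    is as fast as possible. Returns the first layer index of each stage.
    (Assumed load-balancing rule; the abstract does not specify one.)"""
    n = len(costs)
    if not 1 <= num_devices <= n:
        raise ValueError("need 1 <= num_devices <= number of layers")
    # prefix[i] is the total cost of layers 0..i-1
    prefix = [0.0]
    for c in costs:
        prefix.append(prefix[-1] + c)

    inf = float("inf")
    # best[k][i]: smallest achievable bottleneck for the first i layers in k stages
    best = [[inf] * (n + 1) for _ in range(num_devices + 1)]
    cut = [[0] * (n + 1) for _ in range(num_devices + 1)]
    best[0][0] = 0.0
    for k in range(1, num_devices + 1):
        for i in range(k, n + 1):
            for j in range(k - 1, i):  # last stage holds layers j..i-1
                bottleneck = max(best[k - 1][j], prefix[i] - prefix[j])
                if bottleneck < best[k][i]:
                    best[k][i] = bottleneck
                    cut[k][i] = j

    # Walk the recorded cut points backwards to recover the stage starts
    starts = []
    i = n
    for k in range(num_devices, 0, -1):
        i = cut[k][i]
        starts.append(i)
    return starts[::-1]


def select_compression_ratio(
    bandwidth_mbps: float,
    feature_bits: float,
    target_fps: float,
    ratios: Sequence[int] = (1, 2, 4, 8),  # hypothetical candidate ratios
) -> int:
    """Pick the smallest ratio whose compressed transfer fits in one frame
    period; fall back to the largest ratio if none fits.
    (Assumed scheduling rule; the abstract does not specify one.)"""
    if bandwidth_mbps <= 0 or target_fps <= 0:
        raise ValueError("bandwidth and frame rate must be positive")
    budget_s = 1.0 / target_fps
    for ratio in sorted(ratios):
        transfer_s = feature_bits / ratio / (bandwidth_mbps * 1e6)
        if transfer_s <= budget_s:
            return ratio
    return max(ratios)
```

Under these assumptions, `partition_layers([1, 1, 1, 1], 2)` returns `[0, 2]`, and for a 1,000,000-bit feature map at 10 frames per second `select_compression_ratio` returns 4 at 3 Mbps and 2 at 9 Mbps, so a weaker link is compensated by stronger compression.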
