Journal: Progress in Artificial Intelligence

Sequence-To-Sequence Neural Networks Inference on Embedded Processors Using Dynamic Beam Search



Abstract

Sequence-to-sequence deep neural networks have become the state of the art for a variety of machine learning applications, ranging from neural machine translation (NMT) to speech recognition. Many mobile and Internet of Things (IoT) applications would benefit from the ability to perform sequence-to-sequence inference directly on embedded devices, thereby reducing the amount of raw data transmitted to the cloud and yielding benefits in terms of response latency, energy consumption, and security. However, due to the high computational complexity of these models, specific optimization techniques are needed to achieve acceptable performance and energy consumption on single-core embedded processors. In this paper, we present a new optimization technique called dynamic beam search, in which the inference complexity is tuned at runtime to the difficulty of the processed input sequence. Results based on measurements on a real embedded device, and on three state-of-the-art deep learning models, show that our method is able to reduce the inference time and energy by up to 25% without loss of accuracy.
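The abstract does not specify how the paper measures input "difficulty", so the following is only a minimal sketch of the general idea: a beam search whose width is adapted per decoding step, here using the entropy of the model's next-token distribution as an assumed difficulty proxy (low entropy = confident = narrow beam, high entropy = uncertain = wide beam). The function names (`dynamic_beam_search`, `step_logits_fn`) and the entropy criterion are illustrative assumptions, not the authors' implementation.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def entropy(probs):
    """Shannon entropy (nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def dynamic_beam_search(step_logits_fn, bos, eos, max_len=20,
                        min_beam=1, max_beam=4, entropy_threshold=1.0):
    """Beam search whose width adapts to input difficulty at runtime.

    step_logits_fn(prefix) -> list of logits over the vocabulary for the
    next token given the decoded prefix (stands in for the seq2seq decoder).
    When the predictive distribution is confident (low entropy) the beam is
    narrowed toward min_beam, reducing per-step compute; when it is
    uncertain, the beam widens toward max_beam to preserve accuracy.
    """
    beams = [([bos], 0.0)]  # (token sequence, cumulative log-probability)
    for _ in range(max_len):
        candidates = []
        widths = []
        for seq, score in beams:
            if seq[-1] == eos:           # finished hypothesis: carry forward
                candidates.append((seq, score))
                continue
            probs = softmax(step_logits_fn(seq))
            # Difficulty proxy: entropy of the next-token distribution.
            width = max_beam if entropy(probs) > entropy_threshold else min_beam
            widths.append(width)
            top = sorted(range(len(probs)), key=lambda t: -probs[t])[:max_beam]
            for t in top:
                candidates.append((seq + [t], score + math.log(probs[t])))
        beam_width = max(widths) if widths else min_beam
        beams = sorted(candidates, key=lambda c: -c[1])[:beam_width]
        if all(seq[-1] == eos for seq, _ in beams):
            break
    return beams[0][0]
```

With a toy deterministic "decoder" over a 4-token vocabulary (0 = BOS, 1 = EOS) that strongly prefers the chain 0 → 2 → 3 → 1, the search collapses to a width-1 beam at every step and returns the sequence `[0, 2, 3, 1]`, illustrating how easy inputs get the cheapest possible decoding.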


