Journal: Computers & Digital Techniques, IET

Resilient training of neural network classifiers with approximate computing techniques for hardware-optimised implementations


Abstract

As machine learning applications increase the demand for optimised implementations on both embedded and high-end processing platforms, industry and the research community have responded with different approaches to implementing these solutions. This work presents approximations to arithmetic operations and mathematical functions that, combined with a customised adaptive training method for artificial neural networks based on RMSProp, provide reliable and efficient implementations of classifiers. The proposed solution does not rely on the mixed operations with higher precision or the complex rounding methods that are commonly applied. The intention of this work is not to find the optimal simplifications for specific deep learning problems but to present an optimised framework that can be used as reliably as one implemented with precise operations, standard training algorithms and the same network structures and hyper-parameters. By simplifying the 'half-precision' floating point format and approximating exponentiation and square root operations, the authors' work drastically reduces the field programmable gate array implementation complexity (e.g. -43% and -57% in two of the component resources). The reciprocal square root approximation is so simple that it could be implemented with combinational logic alone. In a full software implementation for a mixed-precision platform, only two of the approximations compensate for the processing overhead of precision conversions.
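
The abstract does not spell out the authors' reciprocal square root approximation, so the following is only an illustrative sketch of the general idea, not their method. It uses the widely known shift-and-subtract estimate applied to the 16-bit half-precision bit pattern: because the bit pattern of a positive float is roughly a scaled logarithm of its value, halving and negating it approximates 1/sqrt(x). The constant `MAGIC` and the helper names are assumptions made for this example.

```python
import struct

# Assumed constant for this sketch: about 1.5 * 2**10 * (15 - 0.045),
# where 15 is the half-precision exponent bias and 10 the mantissa width.
MAGIC = 0x59BB


def fp16_to_bits(x: float) -> int:
    """Round x to IEEE 754 half precision and return its 16-bit pattern."""
    return struct.unpack("<H", struct.pack("<e", x))[0]


def bits_to_fp16(bits: int) -> float:
    """Interpret a 16-bit pattern as an IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


def approx_rsqrt_fp16(x: float) -> float:
    """Estimate 1/sqrt(x) for a positive, normal half-precision x.

    Only a one-bit right shift and a fixed-constant subtraction are used,
    with no multiplier and no iteration.
    """
    if x <= 0.0:
        raise ValueError("x must be positive")
    return bits_to_fp16(MAGIC - (fp16_to_bits(x) >> 1))


if __name__ == "__main__":
    for x in (0.25, 1.0, 2.0, 4.0, 100.0):
        exact = x ** -0.5
        approx = approx_rsqrt_fp16(x)
        print(f"x={x:<6} approx={approx:.4f} exact={exact:.4f}")
```

In hardware terms, a shift is wiring and a subtraction from a constant is a small adder, which is consistent with the abstract's remark about combinational logic. In RMSProp, the step size is divided by the root of a running mean of squared gradients, which is where a reciprocal square root naturally arises. Without a refinement step, the estimate above is typically within a few percent of the exact value.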
