Conference: The 24th IEEE International Symposium on Field-Programmable Custom Computing Machines

Increasing Network Size and Training Throughput of FPGA Restricted Boltzmann Machines Using Dropout

Abstract

Restricted Boltzmann Machines (RBMs) are widely used in modern machine learning tasks. Existing implementations are limited in network size and training throughput by the available DSP resources. In this work we propose a new algorithm and architecture for FPGAs, called the dropout-RBM (dRBM) system. Compared with state-of-the-art design methods on the same FPGA, dRBM with a dropout rate of 0.5 doubles the maximum affordable network size using only half of the DSP and BRAM resources. This is achieved by applying dropout, a relatively new technique normally used to avoid overfitting. Here we instead apply dropout to reduce the required DSP and BRAM resources, with the side effect of making training more robust. To improve processing throughput, we also propose a multi-mode matrix multiplication module that maximizes DSP efficiency. On the MNIST classification benchmark, a Stratix IV EP4SGX530 FPGA running dRBM is 34x faster than a single-precision Matlab implementation running on a 2.9 GHz Intel i7 CPU.
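
The abstract does not spell out the dRBM algorithm or its hardware architecture, so the following is only a software sketch of the general idea: when dropout removes a fraction of the hidden units for a training step, the weight sub-matrix that has to be multiplied (and held in fast memory) shrinks by the same fraction. The sketch uses NumPy and one step of contrastive divergence (CD-1), a common RBM training rule that the abstract does not confirm the authors use. The function name `cd1_dropout_step`, the learning rate, and the choice to keep a fixed number of units rather than sampling each unit independently are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_dropout_step(W, b_vis, b_hid, v0, drop_rate=0.5, lr=0.1, rng=None):
    """One CD-1 update that only touches the hidden units kept by dropout.

    W: (n_visible, n_hidden) weights, v0: (batch, n_visible) binary inputs.
    W, b_vis and b_hid are updated in place; returns the kept unit indices.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_hidden = W.shape[1]

    # Assumption: keep a fixed number of units so the sub-matrix size is
    # constant from step to step (illustrative, not taken from the paper).
    n_keep = max(1, int(round((1.0 - drop_rate) * n_hidden)))
    keep = np.sort(rng.choice(n_hidden, size=n_keep, replace=False))

    W_k = W[:, keep]      # (n_visible, n_keep): the only weights used this step
    bh_k = b_hid[keep]

    # Positive phase: hidden probabilities and a binary sample.
    h0_prob = sigmoid(v0 @ W_k + bh_k)
    h0 = (rng.random(h0_prob.shape) < h0_prob).astype(W.dtype)

    # Negative phase: reconstruct the visible layer, then the hidden layer.
    v1_prob = sigmoid(h0 @ W_k.T + b_vis)
    h1_prob = sigmoid(v1_prob @ W_k + bh_k)

    # Gradient estimate; dropped columns of W and entries of b_hid are untouched.
    batch = v0.shape[0]
    W[:, keep] += lr * (v0.T @ h0_prob - v1_prob.T @ h1_prob) / batch
    b_vis += lr * (v0 - v1_prob).mean(axis=0)
    b_hid[keep] += lr * (h0_prob - h1_prob).mean(axis=0)
    return keep
```

With `drop_rate=0.5`, every matrix product in the step has half as many columns as in the full network, which is the software analogue of the resource saving the abstract describes. How the actual design maps this onto DSP blocks and BRAM is not stated in the abstract.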
