Increasing Network Size and Training Throughput of FPGA Restricted Boltzmann Machines Using Dropout

Abstract

Restricted Boltzmann Machines (RBMs) are widely used in modern machine learning tasks. Existing FPGA implementations are limited in network size and training throughput by the available DSP resources. In this work we propose a new algorithm and architecture for FPGAs called the dropout-RBM (dRBM) system. Compared to state-of-the-art design methods on the same FPGA, dRBM with a dropout rate of 0.5 doubles the maximum affordable network size using only half of the DSP and BRAM resources. This is achieved by applying dropout, a relatively new technique normally used to avoid overfitting. Here we instead apply dropout to reduce the required DSP and BRAM resources, with the side effect of making training more robust. To also improve processing throughput, we propose a multi-mode matrix multiplication module that maximizes DSP efficiency. On the MNIST classification benchmark, a Stratix IV EP4SGX530 FPGA running dRBM is 34x faster than a single-precision Matlab implementation running on an Intel i7 2.9 GHz CPU.
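
The resource argument in the abstract can be illustrated with a minimal software sketch. It is not the authors' dRBM architecture, which the abstract does not detail. It only assumes that, with dropout, a training step touches just the weight columns of the hidden units that are kept, so the visible-to-hidden product needs proportionally fewer multiply-accumulates. The function names, the layer sizes, and the choice to keep a fixed number of units per step (rather than sampling each unit independently) are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_active_hidden(n_hidden, dropout_rate, rng):
    """Pick the hidden units kept for one training step.

    Assumption: exactly round(n_hidden * (1 - dropout_rate)) units are kept,
    so the sub-matrix that has to be processed has a fixed shape.
    """
    n_keep = round(n_hidden * (1.0 - dropout_rate))
    return sorted(rng.sample(range(n_hidden), n_keep))


def hidden_probs(visible, weights, hidden_bias, active):
    """Compute p(h_j = 1 | v) for the active hidden units only.

    Returns the probabilities and the number of multiply-accumulates used.
    """
    probs = {}
    macs = 0
    for j in active:
        total = hidden_bias[j]
        for i, v_i in enumerate(visible):
            total += v_i * weights[i][j]
            macs += 1
        probs[j] = sigmoid(total)
    return probs, macs


rng = random.Random(0)
n_visible, n_hidden = 8, 6  # toy sizes, chosen only for the example
weights = [[rng.gauss(0.0, 0.1) for _ in range(n_hidden)] for _ in range(n_visible)]
hidden_bias = [0.0] * n_hidden
visible = [rng.randint(0, 1) for _ in range(n_visible)]

all_units = list(range(n_hidden))
active = sample_active_hidden(n_hidden, 0.5, rng)

_, full_macs = hidden_probs(visible, weights, hidden_bias, all_units)
probs, dropped_macs = hidden_probs(visible, weights, hidden_bias, active)

print(full_macs, dropped_macs)  # prints: 48 24
```

In this toy setting a dropout rate of 0.5 halves the multiply-accumulate count for the same layer, which is the intuition behind fitting a layer twice as large into a fixed multiplier budget. How the actual design maps this onto DSP blocks and BRAM is not stated in the abstract.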

Bibliographic details

Source: The 24th IEEE International Symposium on Field-Programmable Custom Computing Machines, 2015, pp. 48-51 (4 pages)
Conference location: Washington DC (US)
Authors: Jiang Su; David B. Thomas; Peter Y.K. Cheung
Affiliation (all authors): Dept. of Electr. Electron. Engineering, Imperial Coll. London, London, UK
Original format: PDF
Language: English
Keywords: Training; Digital signal processing; Throughput; Field programmable gate arrays; Neurons; Scalability; Random access memory

Similar literature

1. A Fully Pipelined FPGA Architecture of a Factored Restricted Boltzmann Machine Artificial Neural Network [J]. Lok-Won Kim, Sameh Asaad, Ralph Linsker. ACM Transactions on Reconfigurable Technology and Systems. 2014, No. 1
2. A highly parameterizable framework for Conditional Restricted Boltzmann Machine based workloads accelerated with FPGAs and OpenCL [J]. Future Generation Computer Systems. 2020, No. Mara
3. Universal Approximation Results for the Temporal Restricted Boltzmann Machine and the Recurrent Temporal Restricted Boltzmann Machine [J]. Simon Odense, Roderick Edwards. Journal of Machine Learning Research. 2016, No. 158
4. Increasing Network Size and Training Throughput of FPGA Restricted Boltzmann Machines Using Dropout [C]. Jiang Su, David B. Thomas, Peter Y.K. Cheung. IEEE International Symposium on Field-Programmable Custom Computing Machines. 2016
5. An implementation of Deep Belief Networks using restricted Boltzmann machines in Clojure [D]. Sims, James Christopher. 2016
6. A deep learning approach for human behavior prediction with explanations in health social networks: social restricted Boltzmann machine (SRBM+) [O]. Nhathai Phan, Dejing Dou, Brigitte Piniewski
7. Energy-Based Dropout in Restricted Boltzmann Machines: Why Not Go Random [O]. Mateus Roder, Gustavo Henrique de Rosa, Victor Hugo C. de Albuquerque. 2020
