Adaptive dropout for training deep neural networks

Abstract

Recently, it was shown that deep neural networks can perform very well if the activities of hidden units are regularized during learning, e.g., by randomly dropping out 50% of their activities. We describe a method called 'standout' in which a binary belief network is overlaid on a neural network and is used to regularize its hidden units by selectively setting activities to zero. This 'adaptive dropout network' can be trained jointly with the neural network by approximately computing local expectations of binary dropout variables, computing derivatives using back-propagation, and using stochastic gradient descent. Interestingly, experiments show that the learnt dropout network parameters recapitulate the neural network parameters, suggesting that a good dropout network regularizes activities according to magnitude. When evaluated on the MNIST and NORB datasets, we found that our method achieves lower classification error rates than other feature learning methods, including standard dropout, denoising auto-encoders, and restricted Boltzmann machines. For example, our method achieves 0.80% and 5.8% errors on the MNIST and NORB test sets, which is better than state-of-the-art results obtained using feature learning methods, including those that use convolutional architectures.
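The idea in the abstract can be illustrated with a minimal sketch of one hidden layer whose dropout probability depends on the input. This is not the authors' implementation. It assumes that the dropout network reuses the layer's own weights (motivated by the observation that the learnt dropout parameters recapitulate the network parameters), that the hidden units are rectified-linear, and that the expected mask replaces sampling at test time. The names `standout_layer`, `alpha`, and `beta` are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function, used for the keep probability."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def standout_layer(x, weights, biases, alpha=1.0, beta=0.0, train=True, rng=random):
    """Forward pass of one hidden layer with input-dependent dropout.

    x       -- list of input values
    weights -- one list of input weights per hidden unit
    biases  -- one bias per hidden unit
    alpha, beta -- hypothetical scalars mapping the shared pre-activation
                   to a keep probability (an assumption of this sketch)
    """
    activities = []
    for w_j, b_j in zip(weights, biases):
        pre = sum(w * x_i for w, x_i in zip(w_j, x)) + b_j
        hidden = max(0.0, pre)                   # rectified-linear activity (assumed)
        keep_prob = sigmoid(alpha * pre + beta)  # larger pre-activation, more likely kept
        if train:
            # Sample the binary dropout variable for this unit.
            mask = 1.0 if rng.random() < keep_prob else 0.0
            activities.append(mask * hidden)
        else:
            # Assumed test-time rule: scale by the expected mask.
            activities.append(keep_prob * hidden)
    return activities


if __name__ == "__main__":
    x = [0.5, -1.0, 2.0]
    weights = [[0.2, -0.4, 0.1], [-0.3, 0.1, -0.5]]
    biases = [0.0, 0.1]
    print(standout_layer(x, weights, biases, train=True))
    print(standout_layer(x, weights, biases, train=False))
```

With a positive `alpha`, units with large pre-activations are kept more often, which is one way to read the claim that a good dropout network regularizes activities according to magnitude. Standard dropout corresponds to replacing `keep_prob` with the constant 0.5.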

Bibliographic details

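Source: Annual conference on Neural Information Processing Systems | 2013 | 9 pages
Conference location:
Authors: Lei Jimmy Ba; Brendan Frey
Author affiliations:
Conference organizer:
Original format: PDF
Text language:
Chinese Library Classification: Information processing
Keywords: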

Similar literature

1. Rademacher dropout: An adaptive dropout for deep neural network via optimizing generalization gap [J]. Wang Haotian, Yang Wenjing, Zhao Zhenyu. Neurocomputing. 2019, issue SEPa10
2. Adaptive sparse dropout: Learning the certainty and uncertainty in deep neural networks [J]. Chen Yuanyuan, Yi Zhang. Neurocomputing. 2021, issue Auga25
3. Adaptive dropout for training deep neural networks [C]. Lei Jimmy Ba, Brendan Frey. Annual conference on Neural Information Processing Systems. 2013
4. Adaptive dropout for training deep neural networks [D]. Ba, Jimmy Lei. 2014
5. Improving Robustness of Deep Neural Network Acoustic Models via Speech Separation and Joint Adaptive Training [O]. Arun Narayanan, DeLiang Wang
6. Ising-dropout: A Regularization Method for Training and Compression of Deep Neural Networks [O]. Hojjat Salehinejad, Shahrokh Valaee. 2019
