Neurocomputing

Bounded activation functions for enhanced training stability of deep neural networks on visual pattern recognition problems



Abstract

This paper focuses on enhancing the generalization ability and training stability of deep neural networks (DNNs). New activation functions that we call the bounded rectified linear unit (ReLU), bounded leaky ReLU, and bounded bi-firing are proposed. These activation functions are defined based on the desired properties of the universal approximation theorem (UAT). Additional work on providing a new set of coefficient values for the scaled hyperbolic tangent function is also presented. These works result in improved classification performance and training stability in DNNs. Experiments using multilayer perceptron (MLP) and convolutional neural network (CNN) models have shown that the proposed activation functions outperform their respective original forms with regard to classification accuracy and numerical stability. Tests on the MNIST, mnist-rot-bg-img handwritten digit, and AR Purdue face databases show that significant improvements of 17.31%, 9.19%, and 74.99% can be achieved in terms of the testing misclassification error rates (MCRs), applying both the mean squared error (MSE) and cross entropy (CE) loss functions. This is achieved without sacrificing computational efficiency. With the MNIST dataset, bounding the output of an activation function results in a 78.58% reduction in numerical instability, and with the mnist-rot-bg-img and AR Purdue databases the problem is completely eliminated. Thus, this work has demonstrated the significance of bounding an activation function in helping to alleviate the training instability problem when training a DNN model (particularly a CNN). (C) 2016 Elsevier B.V. All rights reserved.
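To make the idea of "bounding" an activation concrete, the sketch below shows one plausible way a ReLU-style activation could be capped at a fixed upper limit. The abstract does not reproduce the paper's exact definitions, so the clipping form, the `upper` limit value, and the function names here are illustrative assumptions rather than the authors' formulations.

```python
import numpy as np

def relu(x):
    # Standard (unbounded) rectified linear unit.
    return np.maximum(0.0, x)

def bounded_relu(x, upper=1.0):
    # Assumed bounded ReLU: clip the ReLU output at a fixed upper
    # limit so activations (and gradients flowing through them)
    # cannot grow without bound.
    return np.minimum(np.maximum(0.0, x), upper)

def bounded_leaky_relu(x, alpha=0.01, upper=1.0):
    # Assumed bounded leaky ReLU: keep a small negative slope alpha,
    # but cap the positive side at `upper`.
    return np.minimum(np.where(x > 0.0, x, alpha * x), upper)

# Large pre-activations stay finite under the bounded variants.
x = np.array([-5.0, -0.5, 0.0, 2.0, 50.0])
print(relu(x))                 # unbounded: last entry stays 50.0
print(bounded_relu(x))         # positive outputs capped at 1.0
print(bounded_leaky_relu(x))   # small negative leak, capped at 1.0
```

Under this reading, the bound keeps layer outputs within a fixed range, which is consistent with the abstract's claim that bounding reduces numerical instability during training without changing the computational cost of the activation.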
