IEEE Conference on Computer Vision and Pattern Recognition
All You Need is Beyond a Good Init: Exploring Better Solution for Training Extremely Deep Convolutional Neural Networks with Orthonormality and Modulation


Abstract

Deep neural networks are difficult to train, and this predicament becomes worse as the depth increases. The essence of the problem lies in the magnitude of backpropagated errors, which results in the gradient vanishing or exploding phenomenon. We show that a variant of regularizer that utilizes orthonormality among different filter banks can alleviate this problem. Moreover, we design a backward error modulation mechanism based on the quasi-isometry assumption between two consecutive parametric layers. Equipped with these two ingredients, we propose several novel optimization solutions that can be utilized for training a specific-structured (repetitive triple modules of Conv-BN-ReLU) extremely deep convolutional neural network (CNN) WITHOUT any shortcuts/identity mappings from scratch. Experiments show that our proposed solutions achieve distinct improvements for 44-layer and 110-layer plain networks on both the CIFAR-10 and ImageNet datasets. Moreover, we can successfully train plain CNNs to match the performance of their residual counterparts. Besides, we propose new principles for designing network structures based on the insights evoked by orthonormality. Combined with the residual structure, we achieve comparable performance on the ImageNet dataset.
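To illustrate the kind of regularizer the abstract refers to, the sketch below computes a soft orthonormality penalty on a convolutional filter bank: each filter is flattened to a row, and the penalty is the squared Frobenius distance between the rows' Gram matrix and the identity. This is a hypothetical minimal version for illustration; the paper's exact regularizer formulation is not given in the abstract.

```python
import numpy as np

def orthonormality_penalty(weights):
    """Soft orthonormality penalty for a conv filter bank.

    `weights` has shape (out_channels, in_channels, k, k). Each filter
    is flattened to a row w_i; the penalty is ||W W^T - I||_F^2, which
    is zero iff the rows form an orthonormal set. Illustrative sketch,
    not the paper's exact formulation.
    """
    out_channels = weights.shape[0]
    w = weights.reshape(out_channels, -1)   # (out_channels, in*k*k)
    gram = w @ w.T                          # pairwise filter correlations
    return np.sum((gram - np.eye(out_channels)) ** 2)

# A bank whose flattened filters are orthonormal incurs (near-)zero penalty.
rng = np.random.default_rng(0)
q, _ = np.linalg.qr(rng.standard_normal((27, 4)))  # 4 orthonormal columns
ortho_bank = q.T.reshape(4, 3, 3, 3)               # 4 filters, shape (3, 3, 3)
print(orthonormality_penalty(ortho_bank))          # ~0.0
```

In training, such a penalty would be added to the task loss so that gradient descent keeps the filter banks close to orthonormal, which is what lets backpropagated error magnitudes stay stable across many layers.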
