Neural Networks: The Official Journal of the International Neural Network Society

Biased Dropout and Crossmap Dropout: Learning towards effective Dropout regularization in convolutional neural network



Abstract

Training a deep neural network with a large number of parameters often leads to overfitting. Recently, Dropout has been introduced as a simple yet effective regularization approach to combat overfitting in such models. Although Dropout has shown remarkable results in many deep neural network cases, its actual effect on CNNs has not been thoroughly explored. Moreover, training a Dropout model significantly increases training time, as it takes longer to converge than a non-Dropout model with the same architecture. To deal with these issues, we propose Biased Dropout and Crossmap Dropout, two novel extensions of Dropout based on the behavior of hidden units in a CNN model. Biased Dropout divides the hidden units in a given layer into two groups based on their magnitude and applies a different Dropout rate to each group. Hidden units with higher activation values, which contribute more to the network's final performance, are retained with a lower Dropout rate, while units with lower activation values are exposed to a higher Dropout rate to compensate. The second approach, Crossmap Dropout, is an extension of regular Dropout to the convolution layer. The feature maps in a convolution layer are strongly correlated with one another, particularly at identical pixel locations across maps. Crossmap Dropout maintains this important correlation while breaking the correlation between adjacent pixels across all feature maps: the same Dropout mask is applied to every feature map, so that units at equivalent positions in each feature map are either all dropped or all active during training. Our experiments with various benchmark datasets show that our approaches provide better generalization than regular Dropout. Moreover, Biased Dropout converges faster during the training phase, suggesting that assigning noise appropriately to hidden units can lead to effective regularization. (C) 2018 Elsevier Ltd. All rights reserved.
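
Below is a minimal PyTorch-style sketch of the Biased Dropout idea described in the abstract. It is an illustration only: the median split of activation magnitudes and the particular rates p_high_units / p_low_units are assumptions made for the example, not the paper's exact procedure.

import torch
import torch.nn as nn

class BiasedDropout(nn.Module):
    """Sketch: lower drop rate for strongly activated units, higher for weak ones."""
    def __init__(self, p_high_units=0.3, p_low_units=0.7):
        super().__init__()
        self.p_high_units = p_high_units  # drop rate for high-activation group (assumed value)
        self.p_low_units = p_low_units    # drop rate for low-activation group (assumed value)

    def forward(self, x):
        # x: (batch, features) output of a fully connected layer
        if not self.training:
            return x
        # Split units into two groups by activation magnitude; a per-sample
        # median split is an illustrative choice, not taken from the paper.
        threshold = x.abs().median(dim=1, keepdim=True).values
        strong = x.abs() >= threshold
        # Per-unit keep probability: strong units are kept more often.
        keep_prob = torch.where(strong,
                                torch.full_like(x, 1.0 - self.p_high_units),
                                torch.full_like(x, 1.0 - self.p_low_units))
        mask = torch.bernoulli(keep_prob)
        # Inverted-dropout scaling keeps the expected activation unchanged.
        return x * mask / keep_prob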
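The following sketch illustrates the Crossmap Dropout behavior described above, under the same caveat: it is an assumed implementation, not the authors' code. One spatial mask is drawn per sample and broadcast over the channel axis, so units at the same (h, w) position are dropped or kept together in every feature map; the rate p is a free hyperparameter.

import torch
import torch.nn as nn

class CrossmapDropout(nn.Module):
    """Sketch: a single spatial dropout mask shared across all feature maps."""
    def __init__(self, p=0.5):
        super().__init__()
        self.p = p  # drop rate (assumed value)

    def forward(self, x):
        # x: (batch, channels, height, width) output of a convolution layer
        if not self.training or self.p == 0.0:
            return x
        keep_prob = 1.0 - self.p
        # One mask per spatial location, broadcast over the channel axis,
        # so equivalent positions in all feature maps share the same fate.
        mask_shape = (x.size(0), 1, x.size(2), x.size(3))
        mask = torch.bernoulli(torch.full(mask_shape, keep_prob,
                                          device=x.device, dtype=x.dtype))
        # Inverted-dropout scaling keeps the expected activation unchanged.
        return x * mask / keep_prob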
