
Convexification and Deconvexification for Training Artificial Neural Networks.


Abstract

The purpose of this dissertation research is to overcome a fundamental problem in the theory and application of artificial neural networks (ANNs): the local minimum problem in training ANNs, which has plagued the ANN community since the mid-1980s.

ANNs trained with backpropagation have been extensively used for decades to solve a wide range of tasks in artificial intelligence. Their computing power derives from their distributed structure together with their ability to learn and to generalize. However, the application and further development of ANNs have long been impeded by the local minimum problem, which has consequently attracted much attention.

A primary difficulty in solving the local minimum problem lies in the intrinsic non-convexity of ANN training criteria, which usually contain a large number of non-global local minima in their weight spaces. Although a great many methods have been developed to optimize the free parameters of the objective function in the hope of consistently reaching a better optimum, these methods cannot fundamentally solve the local minimum problem in the presence of a non-convex criterion.

To alleviate this fundamental difficulty, this dissertation proposes a series of methodologies that apply convexification and deconvexification to avoid non-global local minima and to achieve global or near-global minima with satisfactory optimization and generalization performance. These methodologies are built on a normalized risk-averting error (NRAE) criterion. This criterion removes the practical difficulties of computational overflow and ill-conditioned initialization that exist in the risk-averting error (RAE) criterion, its predecessor, while retaining the RAE criterion's ability to handle non-global local minima by convexifying the non-convex error space.
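The overflow issue mentioned above can be illustrated with a small numeric sketch. Assuming a commonly cited form of the normalized criterion, C̃_λ(w) = (1/λ) ln((1/K) Σ_k exp(λ e_k(w)²)) (an assumption for illustration; the dissertation's exact definition may differ), evaluating it with the log-sum-exp trick stays finite where the raw RAE sum of exponentials would overflow:

```python
import numpy as np

def mse(residuals):
    """Standard mean squared error criterion."""
    return float(np.mean(residuals ** 2))

def nrae(residuals, lam):
    """Normalized risk-averting error, assumed form:
        (1/lam) * log( mean_k exp(lam * e_k^2) ).
    Computed with the log-sum-exp trick so that a large lam
    does not overflow, unlike the raw sum of exponentials.
    """
    z = lam * residuals ** 2
    m = z.max()
    # log-mean-exp: m + log(mean(exp(z - m)))
    return float((m + np.log(np.mean(np.exp(z - m)))) / lam)

e = np.array([0.1, 0.2, 5.0])
# A raw RAE term would need exp(100 * 25) = exp(2500), which overflows
# in double precision; the normalized form stays finite and, for large
# lam, approaches the worst-case squared error max(e^2) = 25:
print(nrae(e, 100.0))   # ~24.99
print(nrae(e, 1e-9))    # ~ mse(e) as lam -> 0
print(mse(e))           # 8.35
```

Note how the single parameter λ interpolates between the ordinary MSE landscape (λ → 0) and a worst-case criterion (λ → ∞), which is what makes a convexification/deconvexification schedule over λ possible.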
With a proper convexification and deconvexification strategy, the NRAE criterion is also shown to be advantageous for the high-dimensional non-convex optimization of deep neural networks, which typically suffer from difficulties such as local minima, saddle points, and large flat regions in their non-convex error spaces.

In this dissertation, the effectiveness of the proposed NRAE-based methods is first evaluated by training multilayer perceptrons (MLPs) on function approximation tasks, demonstrating their optimization advantage in avoiding or alleviating the local minimum problem compared with training under the standard mean squared error criterion. Moreover, NRAE-based training of convolutional neural networks and deep MLPs for recognizing handwritten digits in the MNIST dataset yields better optimization and generalization results than many benchmark performances achieved by combining various non-convex training criteria with deep learning approaches. Finally, a statistical pruning method that removes redundant connections is implemented and evaluated to further improve the generalization ability of ANNs trained with the NRAE criterion.
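As a rough illustration of the convexification and deconvexification idea, the sketch below trains a toy one-parameter linear model by gradient descent on the assumed NRAE form, ramping λ up first and then back down toward the MSE landscape (the NRAE tends to the MSE as λ → 0). The model, the λ schedule, and the step size are illustrative assumptions, not the dissertation's actual algorithm:

```python
import numpy as np

def nrae_grad(w, x, y, lam):
    """Gradient of the assumed NRAE form (1/lam) log mean_k exp(lam e_k^2)
    for the toy model f(x, w) = w * x. The per-sample squared-error
    gradients are mixed by softmax weights, so large-lam training
    concentrates on the worst-fit samples."""
    e = y - w * x
    z = lam * e ** 2
    p = np.exp(z - z.max())
    p /= p.sum()                       # softmax over per-sample terms
    # d(e_k^2)/dw = -2 * e_k * x_k for this model
    return float(np.sum(p * (-2.0 * e * x)))

x = np.array([1.0, 2.0, 3.0])
y = 2.0 * x                            # ground-truth weight is 2.0
w = 5.0                                # deliberately poor start

# Convexify with a large lam, then deconvexify back toward MSE:
# as lam -> 0 the softmax weights become uniform, recovering the
# ordinary mean-squared-error gradient.
for lam in [1000.0, 100.0, 10.0, 1.0, 1e-6]:
    for _ in range(200):
        w -= 0.01 * nrae_grad(w, x, y, lam)

print(round(w, 3))   # converges to ~2.0
```

In this one-dimensional toy problem the MSE landscape is already convex, so the schedule is not needed; the point is only to show the mechanics of sweeping a single homotopy parameter from a convexified criterion back to the original one.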

Bibliographic Details

  • Author

    Gui, Yichuan

  • Affiliation

    University of Maryland, Baltimore County

  • Awarding institution: University of Maryland, Baltimore County
  • Subject: Computer science
  • Degree: Ph.D.
  • Year: 2016
  • Pagination: 179 p.
  • Total pages: 179
  • Format: PDF
  • Language: English
