
Hyperparameter Optimization of Deep Convolutional Neural Networks Architectures for Object Recognition



Abstract

Recent advances in Convolutional Neural Networks (CNNs) have yielded promising results on difficult deep learning tasks. However, the success of a CNN depends on finding an architecture that fits a given problem. Hand-crafting an architecture is a challenging, time-consuming process that requires expert knowledge and effort, owing to the large number of architectural design choices. In this dissertation, we present an efficient framework that automatically designs a high-performing CNN architecture for a given problem. Within this framework, we introduce a new optimization objective function that combines the error rate with the information learnt by a set of feature maps using deconvolutional networks (deconvnet). The new objective function allows the hyperparameters of the CNN architecture to be optimized in a way that enhances performance by guiding the CNN through better visualization of learnt features via deconvnet. The actual optimization of the objective function is carried out via the Nelder-Mead Method (NMM). Moreover, the new objective function yields much faster convergence towards a better architecture. The proposed framework can explore a CNN architecture's numerous design choices efficiently, and also allows effective, distributed execution and synchronization via web services. Empirically, we demonstrate that the CNN architecture designed with our approach outperforms several existing approaches in terms of error rate. Our results are also competitive with state-of-the-art results on the MNIST dataset, and perform reasonably against the state of the art on the CIFAR-10 and CIFAR-100 datasets.
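The Nelder-Mead optimization of such a combined objective can be sketched as follows. The objective below is a toy surrogate: the hyperparameter names (`lr`, `depth`), the surrogate error-rate and feature-quality terms, and the weighting `lam` are illustrative assumptions, not the dissertation's actual formulation, which would require training a CNN and scoring deconvnet feature visualizations at each evaluation.

```python
import math

def nelder_mead(f, x0, step=0.5, max_iter=200, tol=1e-8):
    """Minimal Nelder-Mead simplex minimizer (standard coefficients)."""
    n = len(x0)
    # Initial simplex: x0 plus one perturbed vertex per dimension.
    simplex = [list(x0)]
    for i in range(n):
        p = list(x0)
        p[i] += step
        simplex.append(p)
    alpha, gamma, rho, sigma = 1.0, 2.0, 0.5, 0.5
    for _ in range(max_iter):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        if abs(f(worst) - f(best)) < tol:
            break
        # Centroid of all vertices except the worst.
        centroid = [sum(p[i] for p in simplex[:-1]) / n for i in range(n)]
        refl = [centroid[i] + alpha * (centroid[i] - worst[i]) for i in range(n)]
        if f(best) <= f(refl) < f(simplex[-2]):
            simplex[-1] = refl                       # reflection
        elif f(refl) < f(best):
            exp = [centroid[i] + gamma * (refl[i] - centroid[i]) for i in range(n)]
            simplex[-1] = exp if f(exp) < f(refl) else refl   # expansion
        else:
            contr = [centroid[i] + rho * (worst[i] - centroid[i]) for i in range(n)]
            if f(contr) < f(worst):
                simplex[-1] = contr                  # contraction
            else:                                    # shrink toward best vertex
                simplex = [best] + [
                    [best[i] + sigma * (p[i] - best[i]) for i in range(n)]
                    for p in simplex[1:]
                ]
    simplex.sort(key=f)
    return simplex[0]

# Hypothetical combined objective: a surrogate error rate plus a weighted
# (1 - feature-quality) penalty standing in for the deconvnet-based term.
def combined_objective(theta, lam=0.3):
    lr, depth = theta
    error_rate = (lr - 0.1) ** 2 + 0.05 * (depth - 4.0) ** 2
    feature_quality = math.exp(-((depth - 4.0) ** 2))   # peaks at depth = 4
    return error_rate + lam * (1.0 - feature_quality)

best = nelder_mead(combined_objective, [0.5, 1.0])
print(best)  # should converge near lr ≈ 0.1, depth ≈ 4
```

Nelder-Mead is a sensible choice here because the true objective (train a CNN, measure its error and feature quality) is expensive and has no usable gradient; the simplex method needs only function evaluations.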
Our approach plays a significant role in increasing the depth, reducing the stride sizes, and constraining some convolutional layers to not be followed by pooling layers, in order to find a CNN architecture that produces high recognition performance.

Moreover, we evaluate the effectiveness of reducing the size of the training set for CNNs using a variety of instance selection methods to speed up training, and we study how these methods affect classification accuracy. Many instance selection methods require a long run-time to obtain a representative subset of the dataset, especially when the training set is large and high-dimensional. One such algorithm is Random Mutation Hill Climbing (RMHC). We improve RMHC so that it runs faster than the original algorithm while achieving the same accuracy.
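RMHC-style instance selection can be sketched as follows, assuming the standard formulation: keep a fixed-size subset, mutate it by swapping one member for a random outsider, and accept the mutation only if 1-NN accuracy over the training set does not drop. This is a minimal illustration on toy data; it does not reproduce the dissertation's specific speed-up modifications, and all names and parameters are illustrative.

```python
import random

def knn_accuracy(subset_idx, X, y, eval_idx):
    """1-NN accuracy on eval_idx, classifying with only the selected subset."""
    correct = 0
    for i in eval_idx:
        # Nearest subset member by squared Euclidean distance.
        nearest = min(subset_idx,
                      key=lambda j: sum((a - b) ** 2 for a, b in zip(X[i], X[j])))
        correct += y[nearest] == y[i]
    return correct / len(eval_idx)

def rmhc_select(X, y, subset_size, iters=100, seed=0):
    """Random Mutation Hill Climbing over fixed-size instance subsets."""
    rng = random.Random(seed)
    all_idx = list(range(len(X)))
    subset = rng.sample(all_idx, subset_size)
    outside = [i for i in all_idx if i not in subset]
    best_acc = knn_accuracy(subset, X, y, all_idx)
    for _ in range(iters):
        k = rng.randrange(subset_size)
        m = rng.randrange(len(outside))
        subset[k], outside[m] = outside[m], subset[k]   # mutate: swap one instance
        acc = knn_accuracy(subset, X, y, all_idx)
        if acc >= best_acc:
            best_acc = acc                               # keep the mutation
        else:
            subset[k], outside[m] = outside[m], subset[k]  # revert
    return subset, best_acc

# Toy two-class data: two well-separated Gaussian clusters in 2-D.
data_rng = random.Random(1)
X = [(data_rng.gauss(0, 0.3), data_rng.gauss(0, 0.3)) for _ in range(20)] + \
    [(data_rng.gauss(3, 0.3), data_rng.gauss(3, 0.3)) for _ in range(20)]
y = [0] * 20 + [1] * 20
subset, acc = rmhc_select(X, y, subset_size=4)
print(len(subset), acc)
```

The run-time concern raised above is visible even in this sketch: each mutation requires a full 1-NN evaluation over the training set, which is exactly the cost that grows with training-set size and dimensionality.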

Bibliographic Record

  • Author: Albelwi, Saleh.
  • Author Affiliation: University of Bridgeport.
  • Awarding Institution: University of Bridgeport.
  • Subject: Computer science.
  • Degree: Ph.D.
  • Year: 2018
  • Pages: 105 p.
  • Total Pages: 105
  • Format: PDF
  • Language: eng
  • CLC Classification: Agricultural chemistry
  • Keywords:
  • Date Added: 2022-08-17 11:53:05
