Journal: Neurocomputing

Deep convolutional neural network architecture design as a bi-level optimization problem



Abstract

Over the last decade, deep neural networks have performed strongly in many machine learning tasks such as classification and clustering. One of the most successful networks is the Convolutional Neural Network (CNN), which has been applied in many domains such as pattern recognition, medical diagnosis, and signal processing. Despite the strong performance of CNNs, designing their architecture remains a major challenge for researchers and practitioners. Several works in the literature aim to find optimized architectures, such as ResNet and VGGNet. Unfortunately, most of these architectures are either manually defined by experts or automatically designed by greedy induction algorithms. Recent works suggest using Evolutionary Algorithms (EAs) because of their ability to escape locally optimal architectures. Although EAs have shown promising performance, researchers in this direction have treated the design task as a single-level optimization problem, which is the main research gap we tackle in this paper. Our main contribution rests on the observation that CNN architecture design has a hierarchical nature and can thus be seen as a Bi-Level Optimization Problem (BLOP) where: (1) the upper level minimizes the network complexity, defined by the number of blocks and the number of nodes per block; and (2) the lower level optimizes the topologies of the convolution block 'graphs' by maximizing the classification accuracy. Motivated by the originality of this observation with respect to the state of the art, we frame, for the first time, the CNN architecture design problem as a BLOP and then solve it using an adapted version of an existing efficient bi-level EA, by defining the solution encoding, the fitness function, and the variation operators at each level.
The adapted EA is named BLOP-CNN and is assessed on the image classification task using the commonly employed CIFAR-10 and CIFAR-100 benchmark data sets. The analysis of our experimental results shows the merits of our proposed method in providing the user with optimized architectures that outperform many recent and prominent architectures from three different approaches, namely manual design, reinforcement learning-based generation, and evolutionary optimization. Moreover, to show the applicability of our approach, we have conducted a case study on COVID-19 detection using a set of benchmark chest X-ray and Computed Tomography (CT) images. (c) 2021 Elsevier B.V. All rights reserved.
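
The nested structure described in the abstract (an upper level choosing the number of blocks and nodes per block, a lower level choosing each block's graph topology) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the BLOP-CNN algorithm: the names are hypothetical, `toy_accuracy` is a stand-in for the classification accuracy a trained CNN would give, a simple (1+1) hill-climb replaces the bi-level EA the authors adapt, and the weighted sum in `upper_fitness` is an assumed way of letting lower-level accuracy influence the upper level, since the abstract does not give the fitness functions.

```python
import random

def toy_accuracy(nodes_per_block, topologies):
    """Hypothetical stand-in for validation accuracy; no CNN is trained.
    Rewards edge density near 0.5 and saturates with block size."""
    score = 0.0
    for n, edges in zip(nodes_per_block, topologies):
        density = sum(edges) / len(edges)
        score += (1.0 - abs(density - 0.5)) * n / (n + 2)
    return score / len(nodes_per_block)

def complexity(nodes_per_block):
    # Assumed complexity measure: number of blocks plus total nodes.
    return len(nodes_per_block) + sum(nodes_per_block)

def random_topologies(nodes_per_block, rng):
    # One bit per possible edge i -> j with i < j, so each block graph is acyclic.
    return [[rng.randint(0, 1) for _ in range(n * (n - 1) // 2)]
            for n in nodes_per_block]

def mutate_topologies(topologies, rng):
    # Flip one edge bit in one randomly chosen block.
    child = [edges[:] for edges in topologies]
    block = rng.choice(child)
    block[rng.randrange(len(block))] ^= 1
    return child

def lower_level(nodes_per_block, rng, steps=30):
    """Lower level: search topologies for a fixed upper-level decision."""
    best = random_topologies(nodes_per_block, rng)
    best_acc = toy_accuracy(nodes_per_block, best)
    for _ in range(steps):
        cand = mutate_topologies(best, rng)
        acc = toy_accuracy(nodes_per_block, cand)
        if acc >= best_acc:
            best, best_acc = cand, acc
    return best, best_acc

def mutate_upper(nodes_per_block, rng, max_blocks=4, max_nodes=6):
    # Add a block, drop a block, or resize one block.
    child = nodes_per_block[:]
    move = rng.choice(["add", "drop", "resize"])
    if move == "add" and len(child) < max_blocks:
        child.append(rng.randint(2, max_nodes))
    elif move == "drop" and len(child) > 1:
        child.pop(rng.randrange(len(child)))
    else:
        child[rng.randrange(len(child))] = rng.randint(2, max_nodes)
    return child

def upper_fitness(nodes_per_block, acc, weight=50.0):
    # Assumed scalarisation (not from the paper): complexity plus accuracy penalty.
    return complexity(nodes_per_block) + weight * (1.0 - acc)

def bilevel_search(seed=0, generations=20):
    """Upper level: each candidate is scored only after its lower-level search."""
    rng = random.Random(seed)
    upper = [rng.randint(2, 6) for _ in range(rng.randint(1, 4))]
    topo, acc = lower_level(upper, rng)
    best = (upper_fitness(upper, acc), upper, topo, acc)
    for _ in range(generations):
        cand = mutate_upper(best[1], rng)
        topo, acc = lower_level(cand, rng)
        fit = upper_fitness(cand, acc)
        if fit < best[0]:
            best = (fit, cand, topo, acc)
    return best

if __name__ == "__main__":
    fit, upper, topo, acc = bilevel_search()
    print("nodes per block:", upper, "toy accuracy:", round(acc, 3))
```

The point of the sketch is the evaluation order: an upper-level candidate has no fitness until a full lower-level search has been run for it, which is what distinguishes a bi-level formulation from a single-level search over one flat encoding.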
