IEEE/CVF Conference on Computer Vision and Pattern Recognition

MTL-NAS: Task-Agnostic Neural Architecture Search Towards General-Purpose Multi-Task Learning



Abstract

We propose to incorporate neural architecture search (NAS) into general-purpose multi-task learning (GP-MTL). Existing NAS methods typically define different search spaces for different tasks. To adapt to different task combinations (i.e., task sets), we disentangle GP-MTL networks into single-task backbones (which optionally encode task priors) and a hierarchical, layerwise feature sharing/fusing scheme across them. This enables us to design a novel, general task-agnostic search space that inserts cross-task edges (i.e., feature fusion connections) into fixed single-task network backbones. We also propose a novel single-shot gradient-based search algorithm that closes the performance gap between the searched architectures and the final evaluation architecture. This is realized with a minimum entropy regularization on the architecture weights during the search phase, which makes the architecture weights converge to near-discrete values and therefore yields a single model. As a result, our searched model can be used directly for evaluation without (re-)training from scratch. We perform extensive experiments using different single-task backbones on various task sets, demonstrating the promising performance obtained by exploiting the hierarchical and layerwise features, as well as desirable generalizability to different i) task sets and ii) single-task backbones. The code for our paper is available at https://github.com/bhpfelix/MTLNAS.
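To make the minimum entropy idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes each candidate cross-task edge has one architecture weight obtained by passing a learnable logit through a sigmoid, and that the regularizer is the mean binary entropy of those weights; the abstract does not specify either detail. All function names and the step size `lr` are hypothetical. Only the entropy term is minimized here, whereas in a real search it would be added to the task losses with some weighting coefficient.

```python
import math


def sigmoid(x):
    """Map a logit to an architecture weight in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def binary_entropy(p, eps=1e-12):
    """Entropy (in nats) of a Bernoulli variable with probability p."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def entropy_regularizer(logits):
    """Mean binary entropy over all edge weights (assumed form of the penalty)."""
    return sum(binary_entropy(sigmoid(t)) for t in logits) / len(logits)


def entropy_descent_step(logits, lr):
    """One gradient-descent step on the entropy term alone.

    With a = sigmoid(t): dH/da = log((1 - a) / a) = -t and da/dt = a * (1 - a),
    so dH/dt = -t * a * (1 - a).
    """
    updated = []
    for t in logits:
        a = sigmoid(t)
        grad = -t * a * (1.0 - a)
        updated.append(t - lr * grad)
    return updated


# Hypothetical logits for four candidate cross-task edges.
logits = [0.3, -0.2, 1.0, -0.8]
entropy_before = entropy_regularizer(logits)

for _ in range(200):
    logits = entropy_descent_step(logits, lr=1.0)

entropy_after = entropy_regularizer(logits)
weights = [sigmoid(t) for t in logits]

# Near-discrete weights can be thresholded to decide which edges to keep.
edges_kept = [w > 0.5 for w in weights]
print(entropy_before, entropy_after, edges_kept)
```

Under these assumptions, minimizing the entropy pushes every weight toward 0 or 1 while preserving the sign of its logit, so thresholding at 0.5 changes the network very little. This is one way to read the abstract's claim that the searched model can be evaluated without retraining.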
