
Adaptive multi-teacher multi-level knowledge distillation


Abstract

Knowledge distillation (KD) is an effective learning paradigm for improving the performance of lightweight student networks by utilizing additional supervision knowledge distilled from teacher networks. Most pioneering studies either learn from only a single teacher in their distillation methods, neglecting the potential of a student learning from multiple teachers simultaneously, or simply treat every teacher as equally important, and thus cannot reveal the different importance of teachers for specific examples. To bridge this gap, we propose a novel adaptive multi-teacher multi-level knowledge distillation learning framework (AMTML-KD), which consists of two novel insights: (i) associating each teacher with a latent representation to adaptively learn instance-level teacher importance weights, which are leveraged for acquiring integrated soft targets (high-level knowledge), and (ii) enabling intermediate-level hints (intermediate-level knowledge) to be gathered from multiple teachers through the proposed multi-group hint strategy. As such, a student model can learn multi-level knowledge from multiple teachers through AMTML-KD. Extensive results on publicly available datasets demonstrate that the proposed learning framework enables the student to achieve better performance than strong competitors. (C) 2020 Elsevier B.V. All rights reserved.
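The following is a minimal sketch, not the authors' released code, of how instance-level teacher importance weights could be used to integrate multiple teachers' soft targets for distillation. The function and module names (amtml_kd_loss, TeacherWeightNet) and the particular weighting design are illustrative assumptions based only on the abstract's description.

```python
# Illustrative sketch of adaptive multi-teacher soft-target distillation
# (assumed interface; not the authors' implementation).
import torch
import torch.nn.functional as F


def amtml_kd_loss(student_logits, teacher_logits_list, teacher_weights, labels,
                  temperature=4.0, alpha=0.5):
    """student_logits: (B, C); teacher_logits_list: list of M tensors of shape (B, C);
    teacher_weights: (B, M) instance-level importance weights (rows sum to 1)."""
    # Stack teachers to (B, M, C), weight each teacher per instance, sum over teachers.
    teachers = torch.stack(teacher_logits_list, dim=1)     # (B, M, C)
    weights = teacher_weights.unsqueeze(-1)                 # (B, M, 1)
    integrated_logits = (weights * teachers).sum(dim=1)     # (B, C)

    # High-level knowledge: KL divergence to the integrated soft targets.
    soft_targets = F.softmax(integrated_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    kd = F.kl_div(log_student, soft_targets, reduction="batchmean") * temperature ** 2

    # Standard supervised cross-entropy with the ground-truth labels.
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce


class TeacherWeightNet(torch.nn.Module):
    """Hypothetical weight module: one learnable latent vector per teacher is
    scored against the student's features, then softmaxed over teachers."""
    def __init__(self, feat_dim, num_teachers):
        super().__init__()
        self.latents = torch.nn.Parameter(torch.randn(num_teachers, feat_dim))

    def forward(self, student_feats):                        # (B, D)
        scores = student_feats @ self.latents.t()            # (B, M)
        return F.softmax(scores, dim=-1)
```

The intermediate-level part of the framework (the multi-group hint strategy for gathering hints from multiple teachers) is not shown here; the sketch covers only the adaptive integration of soft targets.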

Bibliographic details

  • Source
    Neurocomputing | 2020, Issue 20 | pp. 106-113 | 8 pages
  • Authors

    Liu Yuang; Zhang Wei; Wang Jun;

  • Author affiliation

    East China Normal Univ, 3663 North Zhongshan Rd, Shanghai 200062, People's Republic of China (all three authors);

  • Indexed in: Science Citation Index (SCI); Engineering Index (EI)
  • Original format: PDF
  • Text language: eng
  • Chinese Library Classification (CLC):
  • Keywords

    Knowledge distillation; Adaptive learning; Multi-teacher;

