Conference on Uncertainty in Artificial Intelligence

Deep Mixture of Experts via Shallow Embedding

Abstract

Larger networks generally have greater representational power at the cost of increased computational complexity. Sparsifying such networks has been an active area of research but has been generally limited to static regularization or dynamic approaches using reinforcement learning. We explore a mixture of experts (MoE) approach to deep dynamic routing, which activates certain experts in the network on a per-example basis. Our novel DeepMoE architecture increases the representational power of standard convolutional networks by adaptively sparsifying and recalibrating channel-wise features in each convolutional layer. We employ a multi-headed sparse gating network to determine the selection and scaling of channels for each input, leveraging exponential combinations of experts within a single convolutional network. Our proposed architecture is evaluated on four benchmark datasets and tasks, and we show that DeepMoEs are able to achieve higher accuracy with lower computation than standard convolutional networks.
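The gating mechanism described in the abstract can be illustrated with a short sketch. Below is a minimal PyTorch-style rendering of the idea, assuming one gating head per convolutional layer fed by a shallow embedding of the input, with a ReLU used to produce sparse, non-negative per-channel gates. All module names, layer sizes, and the ReLU choice are illustrative assumptions, not the authors' released implementation.

# Minimal sketch of per-example channel gating: a shallow embedding of the
# input drives one gating head per conv layer (illustrative, not the authors' code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedConvBlock(nn.Module):
    """Conv layer whose output channels are selected and scaled by input-dependent gates."""
    def __init__(self, in_ch, out_ch, embed_dim):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)
        self.bn = nn.BatchNorm2d(out_ch)
        self.gate = nn.Linear(embed_dim, out_ch)  # one gating head for this layer

    def forward(self, x, embedding):
        gates = F.relu(self.gate(embedding))            # ReLU keeps gates sparse and non-negative
        out = F.relu(self.bn(self.conv(x)))
        return out * gates.unsqueeze(-1).unsqueeze(-1)  # zero out or rescale channels per example

class DeepMoESketch(nn.Module):
    """Shallow embedding network + gated convolutional stack + linear classifier."""
    def __init__(self, num_classes=10, embed_dim=64):
        super().__init__()
        # Shallow embedding of the raw input, shared by all gating heads.
        self.embedder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(32, embed_dim),
        )
        self.blocks = nn.ModuleList([
            GatedConvBlock(3, 64, embed_dim),
            GatedConvBlock(64, 128, embed_dim),
        ])
        self.head = nn.Linear(128, num_classes)

    def forward(self, x):
        e = self.embedder(x)        # per-example shallow embedding
        h = x
        for block in self.blocks:
            h = block(h, e)         # each layer receives its own gates from the embedding
        h = F.adaptive_avg_pool2d(h, 1).flatten(1)
        return self.head(h)

if __name__ == "__main__":
    model = DeepMoESketch()
    logits = model(torch.randn(2, 3, 32, 32))  # e.g. a CIFAR-sized batch
    print(logits.shape)                        # torch.Size([2, 10])

In practice some sparsity pressure on the gate activations (for example an L1 penalty) would be needed to realize the computational savings the abstract claims, since only channels with nonzero gates need to be computed.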
