Pacific Rim International Conference on Artificial Intelligence

A Meta-Reinforcement Learning Approach to Optimize Parameters and Hyper-parameters Simultaneously

Abstract

In the last few years, we have witnessed a resurgence of interest in neural networks. State-of-the-art deep neural network architectures are, however, challenging to design from scratch and require computationally costly empirical evaluation. Hence, considerable research effort has been dedicated to the effective utilisation and adaptation of previously proposed architectures, either through transfer learning or by modifying the original architecture. The ultimate goal of designing a network architecture is to achieve the best possible accuracy for a given task or group of related tasks. Although there have been some efforts to automate the network architecture design process, most existing solutions remain very computationally intensive. This work presents a framework that automatically finds a good set of hyper-parameters yielding reasonably good accuracy while being less computationally expensive than existing approaches. The idea is to frame hyper-parameter selection and tuning within the reinforcement learning regime, so that the parameters of a meta-learner (an RNN) and the hyper-parameters of the target network are tuned simultaneously. The meta-learner is updated using a policy network and generates a tuple of hyper-parameters that is used by the target network. The target network is trained on the given task for a number of steps and produces a validation accuracy whose delta is used as the reward. The reward, together with the state of the network (statistics of the network's final-layer outputs and the training loss), is fed back to the meta-learner, which in turn generates a tuned tuple of hyper-parameters for the next time step. The effectiveness of a recommended tuple can therefore be tested very quickly, rather than waiting for the network to converge. This approach produces accuracy close to that of the state-of-the-art approaches while being comparatively less computationally intensive.
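To make the loop described above concrete, the following is a minimal, self-contained sketch in PyTorch, not the authors' implementation. The names MetaLearnerRNN and train_k_steps, the choice of a Gaussian policy over a two-element hyper-parameter tuple, and the stubbed target-network trainer (which returns random values standing in for "train the target network for k steps") are all illustrative assumptions made here.

```python
import torch
import torch.nn as nn

STATE_DIM = 4    # assumed state: final-layer output statistics + training loss
N_HPARAMS = 2    # assumed tuple: e.g. learning rate and momentum of the target net

class MetaLearnerRNN(nn.Module):
    """Policy network: maps the target network's state to a hyper-parameter tuple."""
    def __init__(self, hidden=32):
        super().__init__()
        self.cell = nn.GRUCell(STATE_DIM, hidden)
        self.mu = nn.Linear(hidden, N_HPARAMS)
        self.log_std = nn.Parameter(torch.zeros(N_HPARAMS))
        self.h = torch.zeros(1, hidden)

    def forward(self, state):
        # Detach the recurrent state so each policy-gradient step has its own graph.
        self.h = self.cell(state, self.h.detach())
        dist = torch.distributions.Normal(self.mu(self.h), self.log_std.exp())
        hparams = dist.sample()
        return hparams, dist.log_prob(hparams).sum()

def train_k_steps(hparams):
    """Stub: apply the tuple, train the target network for k steps, and return
    (validation accuracy, final-layer statistics, training loss)."""
    acc = torch.rand(1).item()
    final_layer_stats = torch.randn(1, STATE_DIM - 1)
    train_loss = torch.rand(1, 1)
    return acc, final_layer_stats, train_loss

meta = MetaLearnerRNN()
opt = torch.optim.Adam(meta.parameters(), lr=1e-3)
state, prev_acc = torch.zeros(1, STATE_DIM), 0.0

for t in range(50):
    hparams, log_prob = meta(state)
    acc, stats, loss_val = train_k_steps(hparams)
    reward = acc - prev_acc                      # delta of validation accuracy
    prev_acc = acc
    state = torch.cat([stats, loss_val], dim=1)  # state fed back to the meta-learner
    policy_loss = -log_prob * reward             # REINFORCE update of the policy
    opt.zero_grad()
    policy_loss.backward()
    opt.step()
```

Because the reward is the change in validation accuracy after only a few training steps, each recommended tuple is scored without waiting for the target network to converge, which is where the computational saving claimed in the abstract comes from.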