
Protecting against evaluation overfitting in empirical reinforcement learning


Abstract

Empirical evaluations play an important role in machine learning. However, the usefulness of any evaluation depends on the empirical methodology employed. Designing good empirical methodologies is difficult in part because agents can overfit test evaluations and thereby obtain misleadingly high scores. We argue that reinforcement learning is particularly vulnerable to environment overfitting and propose, as a remedy, generalized methodologies, in which evaluations are based on multiple environments sampled from a distribution. In addition, we consider how to summarize performance when scores from different environments may not have commensurate values. Finally, we present proof-of-concept results demonstrating how these methodologies can validate an intuitively useful range-adaptive tile coding method.
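
The abstract makes two methodological points: evaluate over many environments sampled from a distribution, and summarize performance when per-environment scores are not directly comparable. The Python sketch below illustrates that workflow only; every name in it (the toy sample_environment distribution, the two agents, and the rank-based mean_rank aggregator) is a hypothetical stand-in, not code or terminology from the paper.

    import random

    def sample_environment(rng):
        # Hypothetical environment distribution: each environment is a
        # target the agent must approximate, with a per-environment reward
        # scale, so raw scores across environments are not commensurate.
        return {"target": rng.uniform(-1.0, 1.0), "scale": rng.choice([1.0, 100.0])}

    def fixed_agent(env):
        # Always answers 0; return is the (scaled) negative error.
        return -abs(env["target"]) * env["scale"]

    def adaptive_agent(env):
        # Answers 0.9 * target, so it is strictly closer whenever target != 0.
        return -0.1 * abs(env["target"]) * env["scale"]

    def mean_rank(scores_by_agent):
        # Summarize by average per-environment rank (1 = best), so each
        # environment contributes equally regardless of its reward scale.
        agents = list(scores_by_agent)
        n_envs = len(next(iter(scores_by_agent.values())))
        ranks = {a: 0.0 for a in agents}
        for i in range(n_envs):
            ordered = sorted(agents, key=lambda a: scores_by_agent[a][i], reverse=True)
            for r, a in enumerate(ordered, start=1):
                ranks[a] += r / n_envs
        return ranks

    rng = random.Random(0)
    envs = [sample_environment(rng) for _ in range(50)]  # fresh sample per evaluation
    scores = {name: [agent(e) for e in envs]
              for name, agent in [("fixed", fixed_agent), ("adaptive", adaptive_agent)]}
    print(mean_rank(scores))  # expect adaptive near rank 1.0, fixed near 2.0

Because ranks are computed within each environment, the environment with reward scale 100.0 cannot dominate the summary the way it would under a raw mean of returns; a lower mean rank indicates better performance across the sampled distribution.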
