Source: JMLR: Workshop and Conference Proceedings
Pessimism About Unknown Unknowns Inspires Conservatism

Abstract

If we could define the set of all bad outcomes, we could hard-code an agent which avoids them; however, in sufficiently complex environments, this is infeasible. We do not know of any general-purpose approaches in the literature to avoiding novel failure modes. Motivated by this, we define an idealized Bayesian reinforcement learner which follows a policy that maximizes the worst-case expected reward over a set of world-models. We call this agent pessimistic, since it optimizes assuming the worst case. A scalar parameter tunes the agent’s pessimism by changing the size of the set of world-models taken into account. Our first main contribution is: given an assumption about the agent’s model class, a sufficiently pessimistic agent does not cause “unprecedented events” with probability $1-\delta$, whether or not designers know how to precisely specify those precedents they are concerned with. Since pessimism discourages exploration, at each timestep, the agent may defer to a mentor, who may be a human or some known-safe policy we would like to improve. Our other main contribution is that the agent’s policy’s value approaches at least that of the mentor, while the probability of deferring to the mentor goes to 0. In high-stakes environments, we might like advanced artificial agents to pursue goals cautiously, which is a non-trivial problem even if the agent were allowed arbitrary computing power; we present a formal solution.
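
The worst-case rule described in the abstract can be illustrated with a toy, one-step sketch. Everything below is an assumption made for illustration and is not taken from the paper: the model set is formed by taking the highest-posterior models until their total weight reaches a hypothetical threshold `beta` (standing in for the scalar pessimism parameter), rewards are given as a small lookup table, and mentor deferral is not modeled. The names `pessimistic_set` and `pessimistic_action` are likewise hypothetical.

```python
from typing import Dict, List


def pessimistic_set(posterior: Dict[str, float], beta: float) -> List[str]:
    """Return the highest-posterior models whose total weight first reaches beta.

    Assumption: this is one simple way to let a scalar control the size of
    the model set; a larger beta yields a larger set and so more pessimism.
    """
    ranked = sorted(posterior, key=posterior.get, reverse=True)
    chosen: List[str] = []
    total = 0.0
    for name in ranked:
        chosen.append(name)
        total += posterior[name]
        if total >= beta:
            break
    return chosen


def pessimistic_action(
    expected_reward: Dict[str, Dict[str, float]],
    posterior: Dict[str, float],
    beta: float,
) -> str:
    """Pick the action whose worst-case expected reward over the set is largest."""
    models = pessimistic_set(posterior, beta)
    actions = list(next(iter(expected_reward.values())).keys())
    worst_case = {
        a: min(expected_reward[m][a] for m in models) for a in actions
    }
    return max(worst_case, key=worst_case.get)


# Hypothetical toy data: three world-models, two actions.
posterior = {"m1": 0.5, "m2": 0.25, "m3": 0.25}
expected_reward = {
    "m1": {"safe": 0.5, "risky": 0.9},
    "m2": {"safe": 0.5, "risky": 0.7},
    "m3": {"safe": 0.4, "risky": 0.0},  # only m3 predicts the risky action fails
}

for beta in (0.5, 0.75, 1.0):
    print(beta, pessimistic_set(posterior, beta),
          pessimistic_action(expected_reward, posterior, beta))
```

In this toy data the agent picks the risky action while the set excludes `m3`, and switches to the safe action once `beta` is large enough to include the one model under which the risky action goes badly. This mirrors, in a very reduced form, how enlarging the set of world-models makes the agent more conservative.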