Pessimism About Unknown Unknowns Inspires Conservatism

Michael K. Cohen; Marcus Hutter

Abstract

If we could define the set of all bad outcomes, we could hard-code an agent which avoids them; however, in sufficiently complex environments, this is infeasible. We do not know of any general-purpose approaches in the literature to avoiding novel failure modes. Motivated by this, we define an idealized Bayesian reinforcement learner which follows a policy that maximizes the worst-case expected reward over a set of world-models. We call this agent pessimistic, since it optimizes assuming the worst case. A scalar parameter tunes the agent’s pessimism by changing the size of the set of world-models taken into account. Our first main contribution is: given an assumption about the agent’s model class, a sufficiently pessimistic agent does not cause “unprecedented events” with probability $1-\delta$, whether or not designers know how to precisely specify those precedents they are concerned with. Since pessimism discourages exploration, at each timestep, the agent may defer to a mentor, who may be a human or some known-safe policy we would like to improve. Our other main contribution is that the agent’s policy’s value approaches at least that of the mentor, while the probability of deferring to the mentor goes to 0. In high-stakes environments, we might like advanced artificial agents to pursue goals cautiously, which is a non-trivial problem even if the agent were allowed arbitrary computing power; we present a formal solution.
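The worst-case maximization described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: a finite list of world-models, a single one-shot decision instead of a sequential policy, and a model set built by adding the highest-posterior models until their total weight reaches a threshold `beta`. The names `top_models`, `pessimistic_action`, `values`, and `beta` are hypothetical, and mentor deferral is not modeled.

```python
from typing import List, Sequence


def top_models(posterior: Sequence[float], beta: float) -> List[int]:
    """Return indices of the highest-posterior models, added in descending
    order of weight until their total weight reaches beta (an assumed rule)."""
    order = sorted(range(len(posterior)), key=lambda i: posterior[i], reverse=True)
    chosen: List[int] = []
    mass = 0.0
    for i in order:
        chosen.append(i)
        mass += posterior[i]
        if mass >= beta:
            break
    return chosen


def pessimistic_action(
    values: Sequence[Sequence[float]], posterior: Sequence[float], beta: float
) -> int:
    """Pick the action whose worst-case expected reward over the selected
    models is largest. values[i][a] is the expected reward of action a
    under model i (toy numbers, supplied by the caller)."""
    models = top_models(posterior, beta)
    n_actions = len(values[0])
    worst = [min(values[i][a] for i in models) for a in range(n_actions)]
    return max(range(n_actions), key=lambda a: worst[a])


# Toy setup: action 0 is "risky", action 1 is "safe".
posterior = [0.6, 0.3, 0.1]
values = [
    [1.0, 0.5],  # model 0: risky action pays off
    [0.9, 0.5],  # model 1: risky action pays off
    [0.0, 0.4],  # model 2: risky action is a disaster
]

low_pessimism = pessimistic_action(values, posterior, beta=0.5)    # only model 0 considered
high_pessimism = pessimistic_action(values, posterior, beta=0.95)  # all three models considered
```

In this toy setup, a small `beta` considers only the most probable model and selects the risky action, while a large `beta` also includes the unlikely model in which that action is disastrous, so the safe action is selected. This mirrors, in a very reduced form, how a larger set of world-models makes the agent more conservative.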

Bibliographic details

Source: JMLR: Workshop and Conference Proceedings, 2020, issue 2010, 30 pages
Authors: Michael K. Cohen; Marcus Hutter
Original format: PDF
