COACH: Learning continuous actions from COrrective Advice Communicated by Humans


Abstract

This paper proposes COACH (COrrective Advice Communicated by Humans), a new interactive learning framework that allows non-expert humans to shape a policy through corrective advice, given as a binary signal in the action domain of the agent. One of the main innovative features of COACH is a mechanism that adaptively adjusts the amount of human feedback a given action receives, taking past feedback into account. The performance of COACH is compared with that of TAMER (Teaching an Agent Manually via Evaluative Reinforcement), ACTAMER (Actor-Critic TAMER), and an autonomous agent trained using SARSA(λ) in two reinforcement learning problems. COACH outperforms all other learning frameworks in the reported experiments. In addition, the results show that COACH can successfully transfer human knowledge to agents with continuous actions, making it a complementary approach to TAMER, which is appropriate for teaching in discrete action domains.
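The idea of binary corrective advice in the action domain can be illustrated with a small sketch. This is not the authors' implementation: the linear policy, the `features` map, the `error_magnitude` constant, and the step-size rule (a base rate plus the magnitude of a running prediction of past advice) are all assumptions chosen only to show how a +1/-1 signal could push a continuous action up or down, and how consistent past advice could enlarge the correction.

```python
def features(state):
    # Hypothetical feature map: a bias term plus the raw scalar state.
    return [1.0, state]


class CorrectiveAdviceLearner:
    """Illustrative learner; every name and constant here is an assumption."""

    def __init__(self, n_features, error_magnitude=0.5, base_rate=0.1,
                 advice_rate=0.2):
        self.policy_w = [0.0] * n_features  # weights of the linear policy
        self.advice_w = [0.0] * n_features  # running model of past advice
        self.error_magnitude = error_magnitude
        self.base_rate = base_rate
        self.advice_rate = advice_rate

    def action(self, phi):
        # Continuous action produced by the current policy.
        return sum(w * f for w, f in zip(self.policy_w, phi))

    def predicted_advice(self, phi):
        # What the model of past feedback expects the human to say here.
        return sum(w * f for w, f in zip(self.advice_w, phi))

    def advise(self, phi, h):
        """Apply binary advice h: +1 means 'increase the action', -1 'decrease'."""
        if h not in (-1, 1):
            raise ValueError("advice must be +1 or -1")
        # Move the advice model toward the advice just received.
        residual = h - self.predicted_advice(phi)
        self.advice_w = [w + self.advice_rate * residual * f
                         for w, f in zip(self.advice_w, phi)]
        # Larger step where past advice has been consistent.
        step = self.base_rate + abs(self.predicted_advice(phi))
        # Shift the policy output in the advised direction.
        self.policy_w = [w + step * h * self.error_magnitude * f
                         for w, f in zip(self.policy_w, phi)]
        return step


learner = CorrectiveAdviceLearner(n_features=2)
phi = features(0.5)
before = learner.action(phi)
first_step = learner.advise(phi, +1)
second_step = learner.advise(phi, +1)
print(before, learner.action(phi), first_step, second_step)
```

In this sketch, repeating the same advice in the same state makes the second correction larger than the first, which is one simple way to read "taking past feedback into account"; the paper itself should be consulted for the actual mechanism.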


Bibliographic record

Source
International Conference on Advanced Robotics | 2015 | pp. 581-586 | 6 pages
Authors
Celemin Carlos; Ruiz-del-Solar Javier
Original format
PDF
Keywords
Robot learning; human feedback in action domains; human teachers; interactive learning


