Journal: IEEE Transactions on Cognitive and Developmental Systems

Learning From Explanations Using Sentiment and Advice in RL


Abstract

For robots to learn from people with no machine learning expertise, they should learn from natural human instruction. Most machine learning techniques that incorporate explanations require people to use a limited vocabulary and to provide state information, even when doing so is not intuitive. This paper discusses a software agent that learned to play the Mario Bros. game using explanations. Our goals for improving learning from explanations were twofold: 1) to filter explanations into advice and warnings and 2) to learn policies from sentences without state information. We used sentiment analysis to filter explanations into advice on what to do and warnings about what to avoid. We developed object-focused advice to represent the actions the agent should take when dealing with objects. A reinforcement learning agent used object-focused advice to learn policies that maximized its reward. After mitigating false negatives, using sentiment as a filter was approximately 85% accurate. Learning with object-focused advice performed better than learning with no advice, the agent learned where to apply the advice, and the agent could recover from adversarial advice. We also found that the method of interaction should be designed to ease the cognitive load of the human teacher; otherwise the advice may be of poor quality.
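
The filtering step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which sentiment model was used, so the word lists `POSITIVE` and `NEGATIVE`, the scoring rule, and the function names `sentiment_score` and `filter_explanations` are all hypothetical stand-ins chosen only to show the idea of sorting explanations into advice (positive sentiment) and warnings (negative sentiment).

```python
import re

# Hypothetical toy lexicon. The abstract does not specify the sentiment
# model, so these word lists are assumptions for illustration only.
POSITIVE = {"good", "great", "collect", "grab", "get", "helpful", "safe"}
NEGATIVE = {"avoid", "bad", "dangerous", "die", "kill", "hurt", "never"}


def sentiment_score(sentence):
    """Return (# positive words) - (# negative words) for one sentence."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    positives = sum(token in POSITIVE for token in tokens)
    negatives = sum(token in NEGATIVE for token in tokens)
    return positives - negatives


def filter_explanations(sentences):
    """Split sentences into advice (score > 0) and warnings (score < 0).

    Sentences with a neutral score of 0 are dropped, since they carry
    no clear signal about what to do or what to avoid.
    """
    advice, warnings = [], []
    for sentence in sentences:
        score = sentiment_score(sentence)
        if score > 0:
            advice.append(sentence)
        elif score < 0:
            warnings.append(sentence)
    return advice, warnings


if __name__ == "__main__":
    explanations = [
        "Grab the coins, they are good.",
        "Avoid the goomba, it will kill you.",
        "Mario is a plumber.",
    ]
    advice, warnings = filter_explanations(explanations)
    print("advice:", advice)
    print("warnings:", warnings)
```

A lexicon count like this is the simplest possible filter and will miss negation and sarcasm; a trained sentiment classifier could be swapped in behind `sentiment_score` without changing the rest of the sketch.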
