Journal: IEICE Transactions on Information and Systems

Least Absolute Policy Iteration-A Robust Approach to Value Function Approximation


Abstract

Least-squares policy iteration is a useful reinforcement learning method in robotics due to its computational efficiency. However, it tends to be sensitive to outliers in observed rewards. In this paper, we propose an alternative method that employs the absolute loss for enhancing robustness and reliability. The proposed method is formulated as a linear programming problem which can be solved efficiently by standard optimization software, so the computational advantage is not sacrificed for gaining robustness and reliability. We demonstrate the usefulness of the proposed approach through a simulated robot-control task.
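
The robustness claim in the abstract can be illustrated with a toy case. The sketch below is not the paper's algorithm: it assumes the simplest possible setting, fitting a single constant to a handful of hypothetical observed rewards, where the squared-loss minimizer is the sample mean and the absolute-loss minimizer is the sample median. The reward values and function names are made up for illustration.

```python
from statistics import mean, median


def squared_loss_fit(rewards):
    """Constant c minimizing sum((r - c)**2): the sample mean."""
    return mean(rewards)


def absolute_loss_fit(rewards):
    """Constant c minimizing sum(abs(r - c)): the sample median."""
    return median(rewards)


# Hypothetical observed rewards (assumed values, not from the paper).
clean = [1.0, 1.1, 0.9, 1.0, 1.05]
# Same rewards plus one corrupted reading acting as an outlier.
noisy = clean + [50.0]

# How far does a single outlier move each estimate?
ls_shift = abs(squared_loss_fit(noisy) - squared_loss_fit(clean))
la_shift = abs(absolute_loss_fit(noisy) - absolute_loss_fit(clean))

print(f"least-squares estimate moved by {ls_shift:.3f}")
print(f"least-absolute estimate moved by {la_shift:.3f}")
```

In this toy setting a single outlier moves the squared-loss estimate far more than the absolute-loss estimate. With a parametric value function the absolute-loss fit no longer has a closed form, which is where the linear programming formulation mentioned in the abstract comes in.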
