
Towards Feature Selection In Actor-Critic Algorithms


Abstract

Choosing features for the critic in actor-critic algorithms with function approximation is known to be a challenge. Too few critic features can lead to degeneracy of the actor gradient, and too many features may lead to slower convergence of the learner. In this paper, we show that a well-studied class of actor policies satisfies the known requirements for convergence when the actor features are selected carefully. We demonstrate that two popular representations for value methods, the barycentric interpolators and the graph Laplacian proto-value functions, can be used to represent the actor in order to satisfy these conditions. A consequence of this work is a generalization of the proto-value function methods to the continuous-action actor-critic domain. Finally, we analyze the performance of this approach using a simulation of a torque-limited inverted pendulum.
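For readers unfamiliar with barycentric interpolators, the following is a minimal one-dimensional sketch of the idea, not the construction used in the paper. It assumes a sorted grid of scalar nodes and maps a state to a sparse feature vector of interpolation weights that are non-negative and sum to one. The function name `barycentric_features` and the example grid are hypothetical choices made for illustration.

```python
from bisect import bisect_right


def barycentric_features(x, nodes):
    """Barycentric weights of scalar x on a sorted 1-D grid (illustrative only).

    At most two entries are non-zero. They are non-negative, sum to 1,
    and reproduce x as a convex combination of the two enclosing nodes.
    """
    if len(nodes) < 2:
        raise ValueError("need at least two grid nodes")
    # Clamp x into the grid so the weights stay a valid convex combination.
    x = min(max(x, nodes[0]), nodes[-1])
    # Index of the left node of the interval that contains x.
    i = min(bisect_right(nodes, x) - 1, len(nodes) - 2)
    # Relative position of x inside that interval, in [0, 1].
    t = (x - nodes[i]) / (nodes[i + 1] - nodes[i])
    features = [0.0] * len(nodes)
    features[i] = 1.0 - t
    features[i + 1] = t
    return features


# Hypothetical grid over a scalar state variable.
nodes = [-1.0, 0.0, 1.0, 2.0]
phi = barycentric_features(0.25, nodes)
print(phi)
```

In higher dimensions the same idea applies per simplex of a triangulated state space, with one weight per vertex of the enclosing simplex; that extension is not shown here.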
