Explorations of the Practical Issues of Learning Prediction-Control Tasks Using Temporal Difference Learning Methods.

Abstract

There has been recent interest in using a class of incremental learning algorithms called temporal difference learning methods to attack problems of prediction. These algorithms have been applied to various prediction problems in the past but remain poorly understood. The purpose of this thesis is to explore this class of algorithms further, particularly the TD(λ) algorithm. A number of practical issues are raised and discussed from a general theoretical perspective and then explored in the context of several case studies. The thesis presents a framework for viewing these algorithms independently of the particular task at hand and uses this framework to explore not only tasks of prediction but also prediction tasks that require control, whether complete or partial. This includes applying the TD(λ) algorithm to two tasks: (1) learning to play tic-tac-toe from the outcome of self-play and of play against a perfectly playing opponent, and (2) learning two simple one-dimensional image segmentation tasks.
