Annual Conference on Neural Information Processing Systems

Temporal Abstraction in Temporal-difference Networks

Abstract

We present a generalization of temporal-difference networks to include temporally abstract options on the links of the question network. Temporal-difference (TD) networks have been proposed as a way of representing and learning a wide variety of predictions about the interaction between an agent and its environment. These predictions are compositional in that their targets are defined in terms of other predictions, and subjunctive in that they are about what would happen if an action or sequence of actions were taken. In conventional TD networks, the inter-related predictions are at successive time steps and contingent on a single action; here we generalize them to accommodate extended time intervals and contingency on whole ways of behaving. Our generalization is based on the options framework for temporal abstraction. The primary contribution of this paper is to introduce a new algorithm for intra-option learning in TD networks with function approximation and eligibility traces. We present empirical examples of our algorithm's effectiveness and of the greater representational expressiveness of temporally abstract TD networks.
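As a rough illustration of what an option-conditional prediction with linear function approximation and eligibility traces can look like, the sketch below implements a generic TD(lambda)-style update for a single prediction. It is not the algorithm from the paper, whose details the abstract does not give. The function name `intra_option_td_step`, the corridor environment, the termination signal `beta_next`, the outcome `z_next`, and the rule of clearing the trace when the behavior departs from the option are all assumptions made for illustration.

```python
def intra_option_td_step(w, e, x, x_next, beta_next, z_next, followed,
                         alpha=0.1, lam=0.9):
    """One linear TD(lambda)-style update for a single option-conditional prediction.

    w, e         : weight vector and eligibility trace (lists of floats)
    x, x_next    : feature vectors at the current and next time step
    beta_next    : assumed probability that the option terminates at the next step
    z_next       : assumed outcome observed if the option terminates there
    followed     : True if the action taken matched the option's policy
    The caller is expected to zero the trace at the start of each episode.
    """
    if not followed:
        # Behavior departed from the option: no update, and earlier
        # states lose their credit (assumed trace-clearing rule).
        return list(w), [0.0] * len(e)
    y = sum(wi * xi for wi, xi in zip(w, x))
    y_next = sum(wi * xi for wi, xi in zip(w, x_next))
    # Compositional target: the outcome on termination, otherwise
    # the prediction made at the next step.
    target = beta_next * z_next + (1.0 - beta_next) * y_next
    delta = target - y
    e_new = [lam * ei + xi for ei, xi in zip(e, x)]
    w_new = [wi + alpha * delta * ei for wi, ei in zip(w, e_new)]
    return w_new, e_new


def one_hot(i, n):
    return [1.0 if j == i else 0.0 for j in range(n)]


def run_corridor(n_states=5, episodes=500):
    """Hypothetical corridor: the option 'walk right' ends at the last
    state, where a wall bit of 1 is observed."""
    w = [0.0] * n_states
    for _ in range(episodes):
        e = [0.0] * n_states  # fresh trace for each episode
        for s in range(n_states - 1):
            at_wall = (s + 1 == n_states - 1)
            w, e = intra_option_td_step(
                w, e, one_hot(s, n_states), one_hot(s + 1, n_states),
                beta_next=1.0 if at_wall else 0.0,
                z_next=1.0 if at_wall else 0.0,
                followed=True)
    return w


if __name__ == "__main__":
    print([round(v, 3) for v in run_corridor()])
```

In this toy setting the learned weight for each non-terminal state approaches 1, which reads as "if I kept walking right until the option ended, I would observe the wall". A prediction of this kind spans several time steps and is contingent on a whole way of behaving, not on a single action.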