Coaching: Learning and Using Environment and Agent Models for Advice

Abstract

Coaching is a relationship in which one agent advises another about how to act. This thesis explores a range of problems faced by an automated coach agent in providing advice to one or more automated advice-receiving agents. The coach's job is to help the agents perform as well as possible in their environment. The author identifies and addresses four technical challenges: how the coach can learn and use models of the environment; how advice should be adapted to the peculiarities of the advice receivers; how opponents can be modeled and how those models can be used; and how advice should be represented so that a team can use it effectively. The thesis is inspired by a simulated robot soccer environment in which a coach agent can provide advice to a team in a standard language. The author developed this coach environment and standard language in collaboration with others. The experiments in the thesis represent the largest known empirical study in the simulated robot soccer environment. A predator-prey domain and a moving maze environment are used for additional experimentation. All algorithms are implemented in at least one of these environments. In addition to the coach problem formulation and decompositions, the thesis makes several technical contributions: (1) several opponent model representations with associated learning algorithms, whose effectiveness in the robot soccer domain is demonstrated; (2) a study of the effects of, and need for, coach learning under various limitations of the advice receiver and communication bandwidth; (3) the Multi-Agent Simple Temporal Network, a multi-agent plan representation that refines the Simple Temporal Network, with an associated distributed plan execution algorithm; and (4) algorithms for learning an abstract Markov Decision Process (MDP) from external observations, a given state abstraction, and partial abstract action templates. The use of the learned MDP for advice is explored in various scenarios.
