Inverse reinforcement learning (IRL), analogously to RL, refers to both the problem and associated methods by which an agent passively observing another agent's actions over time, seeks to learn the latter's reward function. The learning agent is typically called the learner while the observed agent is often an expert in popular applications such as in learning from demonstrations. Some of the assumptions that underlie current IRL methods are impractical for many robotic applications. Specifically, they assume that the learner has full observability of the expert as it performs its task; that the learner has full knowledge of the expert's dynamics; and that there is always only one expert agent in the environment. For example, these assumptions are particularly restrictive in our application scenario where a subject robot is tasked with penetrating a perimeter patrol by two other robots after observing them from a vantage point. In our instance of this problem, the learner can observe at most 10% of the patrol.
展开▼