An agent acting in an environment aims to minimise uncertainty, so that attacks can be predicted and rewards are not found only by chance. These events define an error signal which can be used to improve performance. In this paper we present a new algorithm in which an error signal from a reflex trains a novel deep network: the error is propagated forwards through the network, from its input to its output, in order to generate pro-active actions. We demonstrate the algorithm in two scenarios, a first-person shooter game and a car-driving scenario; in both cases the network develops strategies to become pro-active.