Between supervised and unsupervised learning, connectionism proposes a qualitative form of learning, reinforcement learning, which is of interest for applications requiring qualitative control. This learning technique is not new, and the following advantages of a neural implementation of reinforcement learning have been identified: a small memory requirement and a more effective exploration of the situation-action space. However, the applicability of this algorithm remains restricted to problems with a limited number of actions (usually two). We propose to remove this restriction by suitably specifying the output coding of the actions in the output cell layer of the neural network and interpreting each output value as a certainty that the corresponding action should be performed. Experiments performed in the real world with the miniature robot Khepera confirm that the applicability of reinforcement learning can be extended to cases where multiple actions are possible in each situation.
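The action-selection scheme described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes one output cell per action, each emitting a certainty value, and uses an epsilon-greedy rule (the exploration strategy is our assumption) to trade off exploiting the most certain action against exploring the others.

```python
import random

def select_action(certainties, epsilon=0.1, rng=random):
    """Choose an action index from a list of per-action certainty values.

    certainties -- one value per output cell (one cell per action);
                   higher means the network is more certain the
                   corresponding action is appropriate.
    epsilon     -- probability of exploring a random action instead
                   (assumed exploration rule, not from the paper).
    """
    if rng.random() < epsilon:
        # Explore: pick any action uniformly at random.
        return rng.randrange(len(certainties))
    # Exploit: pick the action whose output cell is most certain.
    return max(range(len(certainties)), key=lambda i: certainties[i])

# With three possible actions instead of the usual two, the same
# network output layer directly supports multi-action selection.
action = select_action([0.2, 0.7, 0.1], epsilon=0.0)
print(action)
```

Because the scheme only reads the output layer, the number of actions is limited solely by the number of output cells, which is the point of the proposed coding.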