We previously (1996) proposed fuzzy interpolation-based Q-learning, in which fuzzy rules represent the Q-function (the action utility function) so that continuous-valued states and actions can be handled. In this paper, we introduce the profit sharing plan (PSP) used in classifier systems into fuzzy interpolation-based Q-learning to accelerate learning, and we discuss its effectiveness through applications to control problems such as cart-pole balancing.
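As a rough illustration of how the two ideas might fit together, the following Python sketch interpolates a Q-function from fuzzy rules and spreads an end-of-episode reward back over the visited state-action pairs in the profit-sharing style. It is a minimal sketch under stated assumptions, not the method as implemented in the paper: the triangular membership functions, the one-dimensional state, the discrete action index (the paper itself treats continuous actions), the geometric credit decay, and all names (`memberships`, `q_value`, `profit_sharing_update`, `lr`, `decay`) are hypothetical choices made for illustration.

```python
def memberships(s, centers):
    """Triangular membership degrees of state s on a sorted grid of rule centers.

    Assumption: neighbouring triangles overlap so that the degrees sum to 1.
    """
    s = min(max(s, centers[0]), centers[-1])  # clamp the state onto the grid
    mu = [0.0] * len(centers)
    for i in range(len(centers) - 1):
        lo, hi = centers[i], centers[i + 1]
        if lo <= s <= hi:
            w = (s - lo) / (hi - lo)
            mu[i] = 1.0 - w
            mu[i + 1] = w
            break
    return mu


def q_value(s, a, centers, q):
    """Fuzzy-interpolated Q(s, a): membership-weighted sum of per-rule values."""
    mu = memberships(s, centers)
    return sum(m * q[i][a] for i, m in enumerate(mu))


def profit_sharing_update(episode, reward, centers, q, lr=0.5, decay=0.5):
    """Distribute an episode-end reward backward over the (state, action) trace.

    Assumption: credit shrinks geometrically with distance from the reward,
    and each rule is reinforced in proportion to its membership degree.
    """
    credit = reward
    for s, a in reversed(episode):
        mu = memberships(s, centers)
        for i, m in enumerate(mu):
            q[i][a] += lr * m * credit
        credit *= decay
```

In this sketch, `q` is a list with one row per fuzzy rule and one column per action, and `episode` is the list of `(state, action)` pairs visited before the reward arrived. Because every pair in the trace is reinforced at once, a single rewarded episode already changes the values of early steps, which is the intuition behind using profit sharing to speed up learning.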