A common assumption in the study of reinforcement learning of coordination is that agents can observe each other's actions (so-called joint-action learning). In this paper we present a number of simple joint-action learning algorithms and show that they perform very well compared with more complex approaches such as OAL (Wang and Sandholm, 2002), while still retaining convergence guarantees. Based on the empirical results, we argue that these simple algorithms should be used as baselines for any future research on joint-action learning of coordination.
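For illustration, here is a minimal sketch of a basic joint-action learner: each agent keeps Q-values over joint actions and an empirical frequency model of its partner, then plays the action with the highest expected value under that model. The class name, parameters, and the toy coordination game are illustrative assumptions, not the specific algorithms evaluated in the paper.

```python
import random
from collections import defaultdict

class JointActionLearner:
    """Illustrative joint-action Q-learner for a two-player matrix game."""

    def __init__(self, actions, alpha=0.1, epsilon=0.1):
        self.actions = actions            # own action set
        self.alpha = alpha                # learning rate
        self.epsilon = epsilon            # exploration rate
        self.q = defaultdict(float)       # Q[(own_action, other_action)]
        self.counts = defaultdict(int)    # observed partner action counts

    def _expected_value(self, a):
        total = sum(self.counts.values())
        if total == 0:
            return 0.0
        # Expected payoff of own action a under the empirical
        # distribution of the partner's observed actions.
        return sum(self.counts[b] / total * self.q[(a, b)]
                   for b in self.actions)

    def act(self):
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=self._expected_value)

    def update(self, own_action, other_action, reward):
        # Joint-action learning: the partner's action is observed,
        # so the Q-value of the joint action is updated directly.
        self.counts[other_action] += 1
        key = (own_action, other_action)
        self.q[key] += self.alpha * (reward - self.q[key])

# Hypothetical usage: two learners on a 2x2 coordination game where
# matching actions pays 1 and mismatching pays 0.
payoff = {("a", "a"): 1, ("b", "b"): 1, ("a", "b"): 0, ("b", "a"): 0}
p1 = JointActionLearner(["a", "b"])
p2 = JointActionLearner(["a", "b"])
for _ in range(2000):
    a1, a2 = p1.act(), p2.act()
    r = payoff[(a1, a2)]
    p1.update(a1, a2, r)
    p2.update(a2, a1, r)
```

Because both agents condition their value estimates on observed joint actions rather than treating the partner as part of a nonstationary environment, even this simple scheme tends to settle on one of the coordinated outcomes in such games.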