Toward learning cooperative behavior for any number of agents, this paper proposes a multi-agent reinforcement learning method without communication, called PMRL-based Learning for Any number of Agents (PLAA). PLAA prevents agents from spending too much time reaching their goals, and it promotes local multi-agent cooperation without communication by building on PMRL, a previous method. To evaluate the effectiveness of PLAA, this paper compares it with Q-learning and two previous methods in 10 types of mazes with 2 and 3 agents. The experimental results show the following: (a) PLAA is the most effective of the compared methods for cooperation among 2 and 3 agents; (b) PLAA enables the agents to cooperate with each other within a small number of iterations.