An important stage in traffic planning is traffic assignment, which seeks to reproduce the way drivers select their routes. It assumes that each driver knows a number of routes from an origin to a destination, experiments with them, and rationally selects the route with the highest utility. This assumption underlies many approaches that iteratively vary the combination of route choices in search of one that maximizes utility. This perspective is therefore a centralized, aggregate one. In reality, however, drivers may also experiment en route, i.e., deviate from the originally planned route. Thus, in this paper, individual drivers are modeled as active, autonomous agents: instead of having a central entity assign complete trips, each agent builds its trip by experimentation during the trip itself. Agents learn their routes by deciding, at each node, how to continue toward their destinations so as to minimize their travel times. Because the choice of one agent affects many others, this is a non-cooperative (and hence stochastic) multiagent learning problem, which is known to be far more challenging than single-agent reinforcement learning. To illustrate the approach, results from two non-trivial networks with thousands of learning agents are presented, clearly configuring a hard learning problem, and are compared against iterative, centralized methods. It is concluded that the agent-based perspective yields choices that are better aligned with the real-world situation because (i) trips are computed by the agents themselves (not provided by any central entity), and (ii) they are not based on pre-computed paths but are built during the trip itself.
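The node-by-node learning idea described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy network, the congestion model, and all parameter values below are illustrative assumptions. Each agent keeps its own Q-table of estimated costs-to-go per (node, next node) pair, chooses its next link epsilon-greedily at every node, and updates its estimates from the travel times it actually experiences under the load produced by all agents.

```python
import random

random.seed(0)  # for reproducibility of this sketch

# Toy network (an assumption for illustration): node -> successor nodes.
# Two alternative routes from origin "A" to destination "D".
GRAPH = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
FREE_FLOW = {("A", "B"): 2.0, ("A", "C"): 3.0, ("B", "D"): 2.0, ("C", "D"): 1.0}

ALPHA, EPSILON, EPISODES, N_AGENTS = 0.1, 0.1, 300, 100

# One independent Q-table per agent: Q[i][(node, nxt)] -> estimated cost-to-go.
Q = [dict() for _ in range(N_AGENTS)]

def cost_to_go(q, node):
    """Lowest estimated remaining cost from `node` (0 at the destination)."""
    succ = GRAPH[node]
    return 0.0 if not succ else min(q.get((node, s), 0.0) for s in succ)

def choose(q, node):
    """Epsilon-greedy choice of the next node at `node`."""
    succ = GRAPH[node]
    if random.random() < EPSILON:
        return random.choice(succ)
    return min(succ, key=lambda s: q.get((node, s), 0.0))

for _ in range(EPISODES):
    # Each agent builds its trip during the trip itself, node by node.
    trips = []
    for q in Q:
        node, trip = "A", []
        while node != "D":
            nxt = choose(q, node)
            trip.append((node, nxt))
            node = nxt
        trips.append(trip)
    # Congestion: link travel time grows with the number of agents on it
    # (a simple linear volume-delay assumption).
    load = {}
    for trip in trips:
        for link in trip:
            load[link] = load.get(link, 0) + 1
    # Q-learning update from the experienced link cost plus downstream estimate.
    for q, trip in zip(Q, trips):
        for (a, b) in trip:
            cost = FREE_FLOW[(a, b)] * (1 + load[(a, b)] / N_AGENTS)
            old = q.get((a, b), 0.0)
            q[(a, b)] = old + ALPHA * (cost + cost_to_go(q, b) - old)
```

Under these assumptions the agents split between the two routes until their experienced costs roughly equalize, without any central entity assigning routes; the equilibrium emerges from the interacting individual learners.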