A reinforcement learning-based method is provided that enables efficient communication in networks having varying numbers and topologies of mobile and stationary nodes. The method provides autonomous, optimized routing that may be implemented in a distributed manner among the nodes, allowing each node to make intelligent decisions about how to forward data from a source node to a destination node with little or no a priori information about the network. The method involves receiving, at a node within a distributed network, data packets containing position and velocity information from a transmitting node. Position and velocity estimates are determined for the transmitting and receiving nodes using the position and velocity information. State-action pair value estimates are determined in the destination direction for forward packets and in the source direction for backward sweeping packets, along with the associated destination-direction and source-direction state value estimates, and these estimates determine packet transmittal.
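
The per-node bookkeeping described above may be illustrated with a minimal sketch. The abstract does not give the update rule, so the sketch assumes a standard Q-learning style update; the names `NodeEstimator`, `alpha`, `gamma`, `reward`, and `predict_position` are hypothetical, as is the constant-velocity position estimate. Only the destination direction is shown; the source direction for backward sweeping packets could be handled by a second table of the same form.

```python
from dataclasses import dataclass, field


@dataclass
class NodeEstimator:
    """Hypothetical per-node table of state-action values toward one destination."""
    alpha: float = 0.5  # assumed learning rate (not specified in the source)
    gamma: float = 0.9  # assumed discount factor (not specified in the source)
    q_dest: dict = field(default_factory=dict)  # neighbor id -> Q value

    def state_value(self) -> float:
        # State value taken as the best state-action value; 0.0 with no neighbors.
        return max(self.q_dest.values(), default=0.0)

    def update(self, neighbor: str, reward: float, neighbor_value: float) -> None:
        # Assumed Q-learning style blend of the old estimate and the new target,
        # where neighbor_value is the state value advertised by the neighbor.
        old = self.q_dest.get(neighbor, 0.0)
        target = reward + self.gamma * neighbor_value
        self.q_dest[neighbor] = (1 - self.alpha) * old + self.alpha * target

    def next_hop(self):
        # Forward to the neighbor with the highest estimated value, if any.
        if not self.q_dest:
            return None
        return max(self.q_dest, key=self.q_dest.get)


def predict_position(pos, vel, dt):
    # Assumed constant-velocity estimate from reported position and velocity.
    return tuple(p + v * dt for p, v in zip(pos, vel))


node = NodeEstimator()
node.update("a", reward=-1.0, neighbor_value=0.0)
node.update("b", reward=-1.0, neighbor_value=-2.0)
chosen = node.next_hop()
estimated = predict_position((0.0, 0.0), (1.0, 2.0), 2.0)
```

In this sketch each hop carries a negative reward, so the neighbor advertising the better state value is preferred as the next hop.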