Disclosed is an optimal robot path planning method based on a partially observable Markov decision process (POMDP), in which a robot searches for an optimal path to a target position. The method builds on a POMDP model and the SARSOP algorithm, and uses a GLS search method as the heuristic condition during the search. For problems with continuous states and large observation spaces, the present invention reduces the number of times that the upper and lower bounds for the belief state are updated along multiple similar paths, which earlier classical algorithms that use an experiment as the heuristic condition update repeatedly. The final optimal policy is not affected, the efficiency of the algorithm is improved, and the robot can obtain a better policy and find a better path in the same amount of time.
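For readers unfamiliar with the term, the belief state referred to above is a probability distribution over the robot's possible states, revised after each action and observation. The following is a minimal sketch of the standard Bayesian belief update for a small discrete POMDP. It is not the disclosed method and does not implement SARSOP or the GLS heuristic; the function name `belief_update`, the array layouts `T[a, s, s']` and `Z[a, s', o]`, and the two-state numbers are all assumptions chosen for illustration.

```python
import numpy as np

def belief_update(belief, action, observation, T, Z):
    """Standard discrete POMDP belief update (illustrative only).

    belief: shape (S,), current probability of each state
    T:      shape (A, S, S), T[a, s, s2] = P(s2 | s, a)   (assumed layout)
    Z:      shape (A, S, O), Z[a, s2, o] = P(o | s2, a)   (assumed layout)
    """
    # Prediction step: push the belief through the transition model.
    predicted = belief @ T[action]
    # Correction step: weight each next state by the observation likelihood.
    unnormalized = Z[action][:, observation] * predicted
    total = unnormalized.sum()
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return unnormalized / total

# Hypothetical toy model: two states, one action, two observations.
T = np.array([[[0.9, 0.1],
               [0.2, 0.8]]])
Z = np.array([[[0.8, 0.2],
               [0.3, 0.7]]])
b0 = np.array([0.5, 0.5])
b1 = belief_update(b0, action=0, observation=0, T=T, Z=Z)
```

In a point-based solver such as SARSOP, upper and lower bounds on the value function are maintained at beliefs reached by updates of this kind, which is why avoiding repeated bound updates along similar paths can save computation.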