EXPLORATORY HJB EQUATIONS AND THEIR CONVERGENCE
Department of Industrial Engineering and Operations Research, Columbia University, New York, NY 10027, USA
Department of Mathematics, University of California, San Diego, CA, USA
HJB equations; stochastic control; partial differential equations; reinforcement learning; exploratory control; entropy regularization; simulated annealing; overdamped Langevin equation