Given a discrete-time stochastic control system $x_{t+1} = F(x_t, a_t, \xi_t)$, $t = 0, 1, \ldots, N$ ($N \le \infty$), where the noise process $\{\xi_t\}$ is a sequence of i.i.d. random elements with distribution $\mu$, let $v_\mu^N(x)$ be the optimal reward function when the initial state is $x$ and the planning horizon is $N$. We give conditions under which $v_\mu^N$ is a continuous function in $\mu$ for several reward criteria. The applicability of these results to nonparametric adaptive control of stochastic systems is briefly