We consider the problem of belief-state monitoring for the purposes of implementing a policy for a partially observable Markov decision process (POMDP), specifically how one might approximate the belief state. Other schemes for belief-state approximation (e.g., based on minimizing a measure such as KL-divergence between the true and estimated state) are not necessarily appropriate for this task; instead, approximation quality is determined by the expected error in utility rather than by the error in the belief state itself. We propose heuristic methods for finding good projection schemes for belief-state estimation, exhibiting anytime characteristics, given a POMDP value function. We also describe several algorithms for constructing bounds on the error in decision quality (expected utility) associated with acting in accordance with a given belief-state approximation.
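To make the setting concrete, the following is a minimal sketch (not the paper's algorithm) of the objects the abstract refers to: an exact POMDP belief update, a crude projection scheme that approximates the belief state, and a value-directed error measure that compares expected utility under a piecewise-linear value function (alpha-vectors) rather than distance between the beliefs themselves. All names and the top-k projection choice are illustrative assumptions.

```python
import numpy as np

def belief_update(b, a, z, T, O):
    """Exact Bayesian belief update: b'(s') ∝ O[a][s', z] * Σ_s T[a][s, s'] b(s).

    T[a] is an |S|×|S| transition matrix for action a; O[a] is an |S|×|Z|
    observation matrix. These data structures are assumptions for this sketch.
    """
    b_next = O[a][:, z] * (b @ T[a])
    return b_next / b_next.sum()

def project_topk(b, k):
    """Crude projection scheme: keep the k most probable states, renormalize.

    A value-directed method would instead choose the projection to minimize
    expected utility loss; top-k is used here only to illustrate the interface.
    """
    smallest = np.argsort(b)[:-k]   # indices of the |S| - k least likely states
    b_approx = b.copy()
    b_approx[smallest] = 0.0
    return b_approx / b_approx.sum()

def utility_error(b_true, b_approx, alpha_vectors):
    """Value-directed error: |V(b_true) - V(b_approx)| where V(b) = max_α α·b."""
    value = lambda b: max(float(alpha @ b) for alpha in alpha_vectors)
    return abs(value(b_true) - value(b_approx))
```

The point of the value-directed view is visible in `utility_error`: two beliefs can be far apart in KL-divergence yet induce the same action and value, and conversely, so error bounds are stated in expected utility.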