Journal: ACM Transactions on Modeling and Computer Simulation

Combining Importance Sampling and Temporal Difference Control Variates To Simulate Markov Chains


Abstract

It is well known that, in estimating performance measures associated with a stochastic system, a good importance sampling (IS) distribution can give orders of magnitude of variance reduction, while a bad one may lead to large, even infinite, variance. In this paper we study how this sensitivity of the estimator variance to the importance sampling change of measure may be "dampened" by combining importance sampling with a stochastic approximation based temporal difference (TD) method. We consider a finite state space discrete time Markov chain (DTMC) with one-step transition rewards and an absorbing set of states, and focus on estimating the cumulative expected reward to absorption starting from any state. In this setting we develop sufficient conditions under which the estimate resulting from the combined approach has a mean square error that asymptotically equals zero, even when the estimate formed by using only the importance sampling change of measure has infinite variance. In particular, we consider the problem of estimating the small buffer overflow probability in a queuing network, where the change of measure suggested in the literature is shown to have infinite variance under certain parameters, and where the appropriate combination of the IS and TD methods can be empirically seen to have a much faster convergence rate compared to naive simulation.
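The setting in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy chain (a gambler's-ruin walk on states 0..N), the change of measure (swapping the up and down probabilities), and all function names. The control variate used here is the likelihood-weighted expected one-step TD error (the Bellman residual) of an approximate value function `V`. This is one plausible way to combine IS with a TD-style control variate, and it may differ from the estimator the authors actually analyze.

```python
import random

# Hypothetical toy chain (not from the paper): states 0..N, with 0 and N absorbing.
N = 10
P_UP = 0.3   # original up-probability, so reaching N from state 1 is rare
Q_UP = 0.7   # up-probability under the assumed importance sampling measure

def reward(x, y):
    """One-step reward: 1 on entering N, so the value is P(hit N before 0)."""
    return 1.0 if y == N else 0.0

def exact_value(x):
    """Closed-form gambler's-ruin probability, used only for checking."""
    rho = (1.0 - P_UP) / P_UP
    return (rho ** x - 1.0) / (rho ** N - 1.0)

def step(x, rng):
    """Sample one transition under the IS measure; return (next state, p/q)."""
    if rng.random() < Q_UP:
        return x + 1, P_UP / Q_UP
    return x - 1, (1.0 - P_UP) / (1.0 - Q_UP)

def td0_pilot(episodes, rng, alpha=0.05, x0=1):
    """Tabular TD(0) run under the IS measure.

    Each target is multiplied by the one-step likelihood ratio, so the
    fixed point of the update is the value function of the ORIGINAL chain.
    """
    V = [0.0] * (N + 1)          # V stays 0 on the absorbing states
    for _ in range(episodes):
        x = x0
        while 0 < x < N:
            y, ratio = step(x, rng)
            V[x] += alpha * (ratio * (reward(x, y) + V[y]) - V[x])
            x = y
    return V

def bellman_residual(V, x):
    """Expected one-step TD error at x under the original transition law."""
    up = P_UP * (reward(x, x + 1) + V[x + 1])
    down = (1.0 - P_UP) * (reward(x, x - 1) + V[x - 1])
    return up + down - V[x]

def is_cv_estimate(V, rng, x0=1):
    """One replication: V(x0) plus likelihood-weighted Bellman residuals."""
    x, weight, total = x0, 1.0, V[x0]
    while 0 < x < N:
        total += weight * bellman_residual(V, x)   # weight covers steps before x
        x, ratio = step(x, rng)
        weight *= ratio
    return total
```

In this sketch the estimator is unbiased for any fixed `V` that is zero on the absorbing states. If `V` equals the true value function, every Bellman residual vanishes and each replication returns `V[x0]`, which gives some intuition for how the error of a combined scheme could shrink as the TD iterate improves. A typical use would be `V = td0_pilot(2000, rng)` followed by averaging `is_cv_estimate(V, rng)` over independent replications.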
