METHOD AND APPARATUS FOR CONSTRUCTING INFORMATIVE OUTCOMES TO GUIDE MULTI-POLICY DECISION MAKING

Abstract

In Multi-Policy Decision-Making (MPDM), many computationally expensive forward simulations are performed to predict the performance of a set of candidate policies. In risk-aware formulations of MPDM, only the worst outcomes affect the decision-making process, so efficiently finding these influential outcomes becomes the core challenge. Recently, stochastic gradient optimization algorithms using a heuristic function were shown to be significantly superior to random sampling. This disclosure shows that accurate gradients can be computed, even through a complex forward simulation, using approaches similar to those in deep networks. The proposed approach finds influential outcomes more reliably and is faster than earlier methods, allowing one to evaluate more policies while eliminating the need to design an easily differentiable heuristic function.
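
The idea of computing gradients through a forward simulation can be illustrated with a minimal sketch. It is not the disclosed method. It assumes a hypothetical one-dimensional scene with a single agent, and the dynamics, the cost function, and every name in it (`rollout`, `cost_gradient`, `find_bad_outcome`, `DT`, `STEPS`, `PUSH`) are illustrative assumptions. The forward pass stores the state at each time step, and the reverse pass applies the chain rule step by step, as backpropagation does across the layers of a deep network. Gradient ascent on the initial agent position then searches for a high-cost outcome.

```python
import math

DT = 0.1      # time step (assumed)
STEPS = 30    # simulation horizon in steps (assumed)
PUSH = 0.5    # strength of the agent's avoidance response (assumed)

def robot_pos(t):
    # Under the candidate policy the robot drives at speed 1.0 from the origin.
    return 1.0 * t * DT

def rollout(x0):
    """Forward simulation: return the total cost and the stored agent states."""
    xs, cost, x = [], 0.0, x0
    for t in range(STEPS):
        xs.append(x)
        d = x - robot_pos(t)
        cost += math.exp(-d * d)           # proximity cost, largest when d == 0
        x = x + DT * PUSH * math.tanh(d)   # agent drifts away from the robot
    return cost, xs

def cost_gradient(x0):
    """Reverse pass: d(total cost)/d(x0) via the chain rule over time steps."""
    cost, xs = rollout(x0)
    adj = 0.0  # the state after the last step does not enter the cost
    for t in reversed(range(STEPS)):
        d = xs[t] - robot_pos(t)
        dcost = -2.0 * d * math.exp(-d * d)                   # d(cost_t)/d(x_t)
        dstep = 1.0 + DT * PUSH * (1.0 - math.tanh(d) ** 2)   # d(x_{t+1})/d(x_t)
        adj = dcost + adj * dstep
    return cost, adj

def find_bad_outcome(x0, step=0.1, iters=50):
    """Gradient ascent with backtracking on the initial agent position."""
    best_cost, g = cost_gradient(x0)
    for _ in range(iters):
        cand = x0 + step * g
        cand_cost, cand_g = cost_gradient(cand)
        if cand_cost > best_cost:
            x0, best_cost, g = cand, cand_cost, cand_g
        else:
            step *= 0.5  # step was too large, so shrink it and retry
    return x0, best_cost
```

Calling `find_bad_outcome(2.5)` returns an initial position whose simulated cost is higher than that of the starting guess. A risk-aware formulation would also weigh how likely that initial configuration is, which this sketch omits.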
