Expert Systems with Applications
Multiobjective Markov chains optimization problem with strong Pareto frontier: Principles of decision making

Abstract

In this paper, we present a novel approach for computing the Pareto frontier in Multi-Objective Markov Chains Problems (MOMCPs) that integrates a regularized penalty method for poly-linear functions. In addition, we present a method that makes the Pareto frontier more useful as a decision support system: it selects the ideal multi-objective option given certain bounds. We restrict our problem to a class of finite, ergodic and controllable Markov chains. The regularized penalty approach is based on Tikhonov's regularization method and employs a projection-gradient approach to find the strong Pareto policies along the Pareto frontier. Unlike previous regularized methods, where the regularizer parameter needs to be large enough and modifies (sometimes significantly) the initial functional, our approach balances the value of the functional using a penalization term (μ) and the regularizer parameter (δ) at the same time, improving the computation of the strong Pareto policies. The idea is to optimize the parameters μ and δ such that the functional conserves its original shape. We set the initial value and then decrease it until each policy approximates the strong Pareto policy. In this sense, we define exactly how the parameters μ and δ tend to zero, and we prove the convergence of the gradient regularized penalty algorithm. On the other hand, our policy-gradient multi-objective algorithms exploit a gradient-based approach so that the corresponding image in the objective space yields a Pareto frontier of only strong Pareto policies. We experimentally validate the method by presenting a numerical example of a real alternative solution of the vehicle routing planning problem to increase security in the transportation of cash and valuables. The decision-making process explored in this work corresponds to the most frequent computational intelligence models applied in practice within the Artificial Intelligence research area. (C) 2016 Elsevier Ltd. All rights reserved.
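The abstract sketches the algorithmic core: a scalarized functional is augmented with a penalty term weighted by μ and a Tikhonov regularizer weighted by δ, minimized by projection-gradient steps over the policy simplex while both parameters are driven toward zero. The following is a minimal illustrative sketch of that scheme, not the authors' implementation; the weighted-sum scalarization, the penalty-gradient callable `constraint_grad`, the geometric `shrink` schedule, and all function names are assumptions made for the example.

```python
import numpy as np

def project_simplex(v):
    # Euclidean projection onto the probability simplex
    # (policy entries are nonnegative and sum to 1).
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > (css - 1))[0][-1]
    theta = (css[rho] - 1) / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def grad_F(x, weights, objective_grads, mu, delta, constraint_grad):
    # Gradient of the scalarized regularized-penalty functional
    #   F(x) = sum_i w_i f_i(x) + mu * g(x) + (delta / 2) * ||x||^2,
    # where g penalizes constraint violation and the delta term is
    # the Tikhonov regularizer (an assumed form, for illustration).
    g = sum(w * gf(x) for w, gf in zip(weights, objective_grads))
    return g + mu * constraint_grad(x) + delta * x

def regularized_penalty_pg(x0, weights, objective_grads, constraint_grad,
                           mu0=1.0, delta0=1.0, shrink=0.5,
                           outer=20, inner=200, step=1e-2):
    # Outer loop: drive mu and delta to zero so the minimizer of the
    # regularized problem approaches a strong Pareto policy.
    # Inner loop: projection-gradient steps over the simplex.
    x, mu, delta = x0.copy(), mu0, delta0
    for _ in range(outer):
        for _ in range(inner):
            x = project_simplex(
                x - step * grad_F(x, weights, objective_grads,
                                  mu, delta, constraint_grad))
        mu *= shrink
        delta *= shrink
    return x
```

Shrinking μ and δ together mirrors the abstract's claim that both parameters tend to zero jointly, so each successive outer iterate approximates a strong Pareto policy more closely while the functional keeps close to its original shape.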