Interval Markov Decision Processes with Multiple Objectives: From Robust Strategies to Pareto Curves
Queens Univ Belfast Sch Elect Elect Engn & Comp Sci Belfast Antrim North Ireland|Chinese Acad Sci State Key Lab Comp Sci Inst Software Beijing Peoples R China;
Audi AG Dept Informat Technol Ingolstadt Germany;
Saarland Univ Saarland Informat Campus Saarbrucken Germany|Inst Intelligent Software Guangzhou Guangdong Peoples R China;
Univ Colorado Dept Smead Aerosp Engn & Sci Boulder CO 80309 USA;
Chinese Acad Sci State Key Lab Comp Sci Inst Software Beijing Peoples R China|Inst Intelligent Software Guangzhou Guangdong Peoples R China;
Interval Markov decision processes; multi-objective optimisation; robust synthesis; Pareto curves; complexity;
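Unifying Two Views on Multiple Mean-Payoff Objectives in Markov Decision Processes
Markov Decision Processes with Multiple Long-Run Average Objectives
Multiple Objective Nonatomic Markov Decision Processes with Total Reward Criteria
Multi-Objective Robust Strategy Synthesis for Interval Markov Decision Processes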
Concurrent Markov Decision Processes for Robust Robot Team Learning under Uncertainty.
Evolving Robust Policy Coverage Sets in Multi-Objective Markov Decision Processes Through Intrinsically Motivated Self-Play