A novel use of value iteration for deriving bounds for threshold and switching curve optimal policies

Ertiningsih Dwi; Bhulai Sandjai; Spieksma Flora


Abstract

In this article, we develop a novel role for the initial function v(0) in the value iteration algorithm. When the optimal policy of a countable-state Markovian queueing control problem has a threshold or switching curve structure, we conjecture that one can tune the choice of v(0) to generate monotonic sequences of n-stage threshold or switching curve optimal policies. We show this for three queueing control models: the M/M/1 queue with admission control, the M/M/1 queue with service control, and the two-competing-queues model with quadratic holding cost. As a consequence, we obtain increasingly tight upper and lower bounds on the optimal threshold or switching curve. After a finite number of iterations, either the optimal threshold or the optimal switching curve values in a finite number of states are available. This procedure can be used to increase numerical efficiency.
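To make the role of v(0) concrete, here is a minimal Python sketch of value iteration for an M/M/1 queue with admission control that records the n-stage admission threshold at every iteration. It is an illustration under stated assumptions, not the authors' implementation: the function name `n_stage_thresholds`, the discounted-cost criterion with factor `beta`, the uniformization `lam + mu = 1`, the linear holding cost, the rejection cost, all numerical values, and the truncation of the state space at `N` are hypothetical choices made here. Two initial functions are compared: v(0) = 0, which makes admitting optimal everywhere at the first stage, and a steep linear v(0), which makes rejecting optimal everywhere at the first stage.

```python
def n_stage_thresholds(v0, lam=0.4, mu=0.6, hold=1.0, reject=5.0,
                       beta=0.95, n_iter=50):
    """Run value iteration from v0 and return the list of n-stage thresholds.

    The n-stage threshold is the smallest queue length x at which rejecting
    an arrival costs no more than admitting it, judged with the previous
    iterate. Assumes lam + mu = 1 (uniformized transition probabilities).
    """
    N = len(v0) - 1              # truncation level (assumption of this sketch)
    v = list(v0)
    thresholds = []
    for _ in range(n_iter):
        new_v = [0.0] * (N + 1)
        threshold = None
        for x in range(N + 1):
            reject_val = reject + v[x]
            # At the truncation boundary an arrival cannot be admitted.
            admit_val = v[x + 1] if x < N else float("inf")
            if threshold is None and reject_val <= admit_val:
                threshold = x
            arrival = min(admit_val, reject_val)
            departure = v[max(x - 1, 0)]   # a service in an empty queue is a dummy step
            new_v[x] = hold * x + beta * (lam * arrival + mu * departure)
        thresholds.append(threshold)
        v = new_v
    return thresholds


N = 30
# v(0) = 0: the first-stage policy admits everywhere below the truncation level.
from_above = n_stage_thresholds([0.0] * (N + 1))
# Steep linear v(0), slope larger than the rejection cost: the first-stage
# policy rejects in every state.
from_below = n_stage_thresholds([10.0 * x for x in range(N + 1)])

print("thresholds starting from v0 = 0:   ", from_above[:10])
print("thresholds starting from v0 = 10x: ", from_below[:10])
```

Printing the two sequences side by side shows how the starting function steers the thresholds from opposite ends. Whether such sequences are monotone and bracket the optimal threshold depends on conditions established in the article; the sketch does not verify them.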


Bibliographic record

Source
Naval Research Logistics, 2018, No. 8, pp. 638-659 (22 pages)
Authors
Ertiningsih Dwi; Bhulai Sandjai; Spieksma Flora
Author affiliations

Univ Gadjah Mada, Dept Math, Yogyakarta, Indonesia | Leiden Univ, Dept Math, Leiden, Netherlands

Vrije Univ Amsterdam, Dept Math, Amsterdam, Netherlands

Leiden Univ, Dept Math, Leiden, Netherlands

Original format: PDF
Language of text: English
Keywords
deriving bounds; optimal policies; value iteration

