Semi-Markov adaptive critic heuristics with application to airline revenue management

Ketaki KULKARNI; Abhijit GOSAVI; Susan MURRAY; Katie GRANTHAM

Abstract

The adaptive critic heuristic has been a popular algorithm in reinforcement learning (RL) and approximate dynamic programming (ADP) alike. It is one of the first RL and ADP algorithms. RL and ADP algorithms are particularly useful for solving Markov decision processes (MDPs) that suffer from the curses of dimensionality and modeling. Many real-world problems, however, tend to be semi-Markov decision processes (SMDPs) in which the time spent in each transition of the underlying Markov chains is itself a random variable. Unfortunately, for the average reward case, unlike the discounted reward case, the MDP does not have an easy extension to the SMDP. Examples of SMDPs can be found in the areas of supply chain management, maintenance management, and airline revenue management. In this paper, we propose an adaptive critic heuristic for the SMDP under the long-run average reward criterion. We present the convergence analysis of the algorithm, which shows that under certain mild conditions, which can be ensured within a simulator, the algorithm converges to an optimal solution with probability 1. We test the algorithm extensively on a problem of airline revenue management in which the manager has to set prices for airline tickets over the booking horizon. The problem has a large scale, suffering from the curse of dimensionality, and hence it is difficult to solve via classical methods of dynamic programming. Our numerical results are encouraging and show that the algorithm outperforms an existing heuristic used widely in the airline industry.
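To make the long-run average reward criterion concrete, the following minimal sketch evaluates a fixed policy on a small SMDP. It is not the adaptive critic algorithm proposed in the paper; it only illustrates the standard renewal-reward ratio, in which the average reward per unit time is the expected reward per transition divided by the expected time per transition, both weighted by the stationary distribution of the embedded Markov chain. The two-state chain, the rewards `r`, the mean transition times `t`, and the function names are all hypothetical, and the power iteration assumes the embedded chain is irreducible and aperiodic.

```python
def stationary_distribution(P, iters=1000):
    """Approximate the stationary distribution of the embedded chain
    by power iteration (assumes an irreducible, aperiodic chain)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def smdp_average_reward(P, r, t):
    """Long-run reward per unit time under a fixed policy:
    rho = sum_i pi_i * r_i / sum_i pi_i * t_i,
    where r_i is the expected reward and t_i the expected
    transition time when leaving state i."""
    pi = stationary_distribution(P)
    expected_reward = sum(p * ri for p, ri in zip(pi, r))
    expected_time = sum(p * ti for p, ti in zip(pi, t))
    return expected_reward / expected_time


# Hypothetical two-state example (numbers chosen only for illustration).
P = [[0.9, 0.1],
     [0.5, 0.5]]   # transition probabilities of the embedded chain
r = [10.0, 2.0]    # expected reward earned on leaving each state
t = [4.0, 1.0]     # expected time spent in each transition

rho_smdp = smdp_average_reward(P, r, t)
# With every transition time fixed at 1, the SMDP reduces to an MDP.
rho_mdp = smdp_average_reward(P, r, [1.0, 1.0])

print(f"SMDP average reward per unit time: {rho_smdp:.4f}")
print(f"MDP average reward per step:       {rho_mdp:.4f}")
```

Because the time term appears in the denominator, the two criteria can rank policies differently, which is one way to see why the average reward case needs more care for SMDPs than the discounted case.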

Bibliographic details

Source
《控制理论与应用：英文版》 (Control Theory and Applications, English Edition) | 2011, Issue 3 | pp. 421-430 | 10 pages
Authors
Ketaki KULKARNI; Abhijit GOSAVI; Susan MURRAY; Katie GRANTHAM
Affiliation

Department of Engineering Management and Systems Engineering, Missouri University of Science and Technology
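Original format: PDF
Full-text language: chi
Chinese Library Classification: Markov processes
Keywords
adaptive critic; actor-critic; semi-Markov; approximate dynamic programming; reinforcement learning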
