Evaluating reinforcement learning for game theory application: learning to price airline seats under competition

Abstract

Applied Game Theory has been criticised for not being able to model real decision-making situations. A game's sensitive nature and the difficulty of determining the utility payoff functions make it hard for a decision maker to rely upon any game-theoretic results. Therefore the models tend to be simple due to the complexity of solving them (i.e. finding the equilibrium).

In recent years, due to increases in computing power, different computer modelling techniques have been applied in Game Theory. A major example is Artificial Intelligence methods, e.g. Genetic Algorithms, Neural Networks and Reinforcement Learning (RL). These techniques allow the modeller to incorporate Game Theory within their models (or simulations) without necessarily knowing the optimal solution. After a warm-up period of repeated episodes is run, the model learns to play the game well (though not necessarily optimally). This is a form of simulation-optimization.

The objective of the research is to investigate the practical usage of RL within a simple sequential stochastic airline seat pricing game. Different forms of RL are considered and compared to the optimal policy, which is found using standard dynamic programming techniques. The airline game and RL methods display various interesting phenomena, which are also discussed. For completeness, convergence proofs for the RL algorithms were constructed.
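The comparison the abstract describes, an RL policy learned over repeated episodes set against an optimum from dynamic programming, can be illustrated with a minimal sketch. This is not the thesis's model: it assumes a single seller rather than competing airlines, and the fare levels, sale probabilities, horizon and learning parameters are all hypothetical values chosen for illustration. Tabular Q-learning stands in for the "different forms of RL", which the abstract does not name.

```python
import random

PRICES = (1.0, 2.0)                # hypothetical fare levels
SALE_PROB = {1.0: 0.9, 2.0: 0.5}   # assumed chance of selling one seat per period


def solve_dp(horizon, seats):
    """Backward induction: value[t][s] is the optimal expected revenue-to-go."""
    value = [[0.0] * (seats + 1) for _ in range(horizon + 1)]
    for t in range(horizon - 1, -1, -1):
        for s in range(1, seats + 1):
            value[t][s] = max(
                SALE_PROB[p] * (p + value[t + 1][s - 1])
                + (1 - SALE_PROB[p]) * value[t + 1][s]
                for p in PRICES
            )
    return value


def q_learning(horizon, seats, episodes=20000, alpha=0.1, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration, no discounting."""
    rng = random.Random(seed)
    q = {(t, s): [0.0] * len(PRICES)
         for t in range(horizon + 1) for s in range(seats + 1)}
    for _ in range(episodes):
        s = seats
        for t in range(horizon):
            if s == 0:          # sold out: the episode is over
                break
            if rng.random() < epsilon:
                a = rng.randrange(len(PRICES))
            else:
                a = max(range(len(PRICES)), key=lambda i: q[(t, s)][i])
            price = PRICES[a]
            sold = rng.random() < SALE_PROB[price]
            reward = price if sold else 0.0
            s_next = s - 1 if sold else s
            # Terminal entries (t == horizon or s == 0) are never updated, so stay 0.
            target = reward + max(q[(t + 1, s_next)])
            q[(t, s)][a] += alpha * (target - q[(t, s)][a])
            s = s_next
    return q


def evaluate_greedy(q, horizon, seats):
    """Exact expected revenue of the greedy policy read off the Q-table."""
    value = [[0.0] * (seats + 1) for _ in range(horizon + 1)]
    for t in range(horizon - 1, -1, -1):
        for s in range(1, seats + 1):
            a = max(range(len(PRICES)), key=lambda i: q[(t, s)][i])
            p = PRICES[a]
            value[t][s] = (SALE_PROB[p] * (p + value[t + 1][s - 1])
                           + (1 - SALE_PROB[p]) * value[t + 1][s])
    return value[0][seats]


optimal = solve_dp(horizon=3, seats=2)[0][2]
learned = evaluate_greedy(q_learning(horizon=3, seats=2), horizon=3, seats=2)
print(f"DP optimum: {optimal:.3f}, greedy Q-learning policy: {learned:.3f}")
```

The learned policy is scored exactly against the known model rather than by simulation, so any gap to the DP optimum reflects the policy itself and not sampling noise. By construction the learned value can never exceed the DP optimum.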

Bibliographic record

  • Author

    Collins, Andrew

  • Author affiliation
  • Year 2009
  • Total pages
  • Original format PDF
  • Language English
  • Chinese Library Classification
