
Storage system management using reinforcement learning techniques and nonlinear models.


Abstract

In this thesis, modeling and optimization in the field of storage management under stochastic conditions are investigated using two different methodologies: Simulation Optimization Techniques (SOT), which are usually categorized in the area of Reinforcement Learning (RL), and Nonlinear Modeling Techniques (NMT).

For the first set of methods, simulation plays a fundamental role in evaluating the control policy: learning techniques are used to deliver sub-optimal policies at the end of a learning process. These iterative methods rely on the interaction of agents with the stochastic environment, in which agents take actions and observe the resulting states. To converge to the steady-state condition, where policies and value functions no longer change significantly as learning continues, all states, or at least the most important ones, must be visited sufficiently often. This can be prohibitively time-consuming for large-scale problems. To make these techniques more efficient in terms of both computation time and the robustness of the resulting policies, the idea of Opposition-Based Learning (OBL, Type I and Type II) is employed to modify and extend popular RL techniques including Q-Learning, Q(λ), SARSA, and SARSA(λ). Several new algorithms are developed using this idea. It is also shown that function approximation techniques such as neural networks can contribute to the learning process. State-of-the-art implementations usually consider the maximization of the expected value of the accumulated reward. Extending these techniques to consider risk and solving some well-known control problems are important contributions of this thesis.

Furthermore, the new nonlinear modeling approach for reservoir management using indicator functions and randomized policies, introduced by Fletcher and Ponnambalam, is extended to stochastic releases in multi-reservoir systems. In this extension, two different approaches for defining the release policies are proposed. In addition, the main restriction of assuming a normal distribution for inflow is relaxed by using a beta-equivalent general distribution. A five-reservoir case study from India is used to demonstrate the benefits of these new developments. Using a warehouse management problem as an example, application of the proposed method to other storage management problems is outlined.
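To illustrate the opposition-based idea mentioned in the abstract, the following is a minimal sketch of tabular Q-Learning with one extra "opposite action" update per step. It is not the thesis's algorithm: the toy single-reservoir simulator (`step`), the reward, the mirrored definition of the opposite action, and all parameter values are hypothetical choices made for illustration. The sketch also assumes the simulator can be re-evaluated for the opposite action under the same inflow, which is only plausible when a model of the storage dynamics is available.

```python
import random

N_LEVELS = 5      # storage levels 0..4 (hypothetical toy reservoir)
N_ACTIONS = 3     # release 0, 1 or 2 units (hypothetical)
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1  # illustrative learning parameters


def step(storage, release, inflow):
    """Toy simulator: return (reward, next_storage) for a release decision."""
    actual = min(release, storage + inflow)                       # cannot release more than is available
    next_storage = min(storage + inflow - actual, N_LEVELS - 1)   # excess water spills
    reward = -(actual - 1) ** 2                                   # penalize deviation from a demand of 1
    return reward, next_storage


def opposite_action(action):
    """Assumed opposite: mirror the action within its range."""
    return (N_ACTIONS - 1) - action


def q_update(q, s, a, r, s_next):
    """Standard Q-Learning update for one (state, action) pair."""
    target = r + GAMMA * max(q[s_next])
    q[s][a] += ALPHA * (target - q[s][a])


def train(episodes=200, horizon=50, seed=0):
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(N_LEVELS)]
    for _ in range(episodes):
        s = rng.randrange(N_LEVELS)
        for _ in range(horizon):
            # Epsilon-greedy action selection.
            if rng.random() < EPSILON:
                a = rng.randrange(N_ACTIONS)
            else:
                a = max(range(N_ACTIONS), key=lambda k: q[s][k])
            inflow = rng.choice([0, 1, 2])
            r, s_next = step(s, a, inflow)
            q_update(q, s, a, r, s_next)
            # Extra update for the opposite action, using the same inflow,
            # so two entries of the table are refreshed per interaction.
            a_opp = opposite_action(a)
            if a_opp != a:
                r_opp, s_opp = step(s, a_opp, inflow)
                q_update(q, s, a_opp, r_opp, s_opp)
            s = s_next
    return q


if __name__ == "__main__":
    for level, row in enumerate(train()):
        print(level, [round(v, 2) for v in row])
```

The intent of the extra update is that each interaction informs two table entries instead of one, which is one way the state-action space might be covered with fewer visits.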

Bibliographic details

  • Author: Mahootchi, Masoud
  • Author affiliation: University of Waterloo (Canada)
  • Degree-granting institution: University of Waterloo (Canada)
  • Subject: Engineering System Science
  • Degree: Ph.D.
  • Year: 2009
  • Pages: 153 p.
  • Format of original: PDF
  • Language of text: English
