Using reinforcement learning to improve network durability.

Abstract

Our goal is to determine and optimize the efficacy of reinforcing an existing flow network to prevent unmet demand caused by imminent disruptions. We are given failure probabilities for the edges of the network and are asked to find the edges whose reinforcement best provides durability to the network post-event. The problem is extended to multiple time steps to address the trade-off between available resources and installation quality: the farther from the event one makes decisions, the more resources are available but the less reliable the uncertainty information. This sequential decision-making process is a classic example of dynamic programming. To avoid the "curses of dimensionality", we formulate an approximate dynamic program. To improve performance, especially as applied to flow networks, we derive several innovative adaptations of reinforcement learning concepts. This involves developing a policy, a function that makes installation decisions given current forecast information, in a two-step process: policy evaluation and policy improvement.

The primary solution technique takes forecast samples from a Monte Carlo simulation in the style of stochastic programming. Once a forecast is obtained, the problem is set up by taking additional samples of the forecast probabilities to determine edge capacities for the given time step; this forms the state information used by the approximate dynamic program. The sampled outcome information is used to define network constraints for the policy improvement step. The approximation of future costs is then refined by comparing the improved policy against a desired target objective, and this process is repeated over several iterations. Lastly, we provide empirical evidence corroborating the basic convergence theorems that hold for simpler forms of the reinforcement learning process.

With a trained policy in hand, we compare its performance against traditional two-stage stochastic programs with recourse that use a sample average approximation model. We consider several implementations of the stochastic problem to gauge performance in a variety of ways. The material presented here is developed in the context of preparing urban infrastructure against damage caused by disasters; however, it is applicable to any flow network. This work contributes to both multistage stochastic programming and approximate dynamic programming by introducing techniques from each field into the other. We also apply reinforcement learning techniques to flow networks in ways that, as of this writing, have not been addressed.
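
As a concrete illustration of the evaluation problem described above, the sketch below estimates expected unmet demand for a candidate reinforcement plan by Monte Carlo simulation of edge failures followed by a max-flow computation. This is a minimal sketch under assumed names and data: the toy network, the failure probabilities, and the function expected_unmet_demand are illustrative, not the dissertation's actual model.

```python
import random

import networkx as nx


def expected_unmet_demand(G, source, sink, demand, fail_prob, reinforced,
                          n_samples=1000, seed=0):
    """Estimate E[max(demand - post-event max flow, 0)] by simulation."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        H = nx.DiGraph()
        H.add_nodes_from(G.nodes)
        for u, v, data in G.edges(data=True):
            r = rng.random()  # drawn for every edge so scenarios align across plans
            if (u, v) in reinforced or r >= fail_prob[(u, v)]:
                H.add_edge(u, v, capacity=data["capacity"])
        flow_value, _ = nx.maximum_flow(H, source, sink)
        total += max(demand - flow_value, 0.0)
    return total / n_samples


# Toy network: two parallel s -> t paths, each edge failing with probability 0.3.
G = nx.DiGraph()
G.add_edge("s", "a", capacity=5)
G.add_edge("a", "t", capacity=5)
G.add_edge("s", "b", capacity=5)
G.add_edge("b", "t", capacity=5)
probs = {e: 0.3 for e in G.edges}

# Expected shortfall when one full path is hardened against failure.
print(expected_unmet_demand(G, "s", "t", demand=8, fail_prob=probs,
                            reinforced={("s", "a"), ("a", "t")}))
```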
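
The policy evaluation / policy improvement loop can be sketched schematically as well. The version below maintains a linear value-function approximation and updates it toward sampled one-step targets; the feature map, transition dynamics, and stage cost are hypothetical placeholders standing in for the flow-network formulation, not the author's model.

```python
import numpy as np

rng = np.random.default_rng(0)

def features(state):
    # Hypothetical feature map; the dissertation's state is far richer.
    return np.asarray(state, dtype=float)

def sample_forecast():
    # Stand-in for a Monte Carlo forecast sample.
    return rng.normal(size=2)

def step(state, action, forecast):
    # Placeholder transition and stage cost for a two-dimensional state.
    nxt = 0.9 * np.asarray(state) + 0.1 * forecast + 0.05 * action
    cost = float(nxt @ nxt) + 0.1 * action
    return nxt, cost

ACTIONS = [0.0, 1.0]       # e.g. "do nothing" vs. "reinforce an edge"
theta = np.zeros(2)        # weights of the approximation V(s) ~ theta . features(s)
alpha, gamma = 0.05, 0.95  # learning rate and discount factor

for _ in range(200):       # repeated iterations, as in the abstract
    state = rng.normal(size=2)
    for _ in range(10):
        forecast = sample_forecast()
        # Policy improvement: act greedily against the current approximation.
        best = min(ACTIONS,
                   key=lambda a: step(state, a, forecast)[1]
                   + gamma * (theta @ features(step(state, a, forecast)[0])))
        state_next, cost = step(state, best, forecast)
        # Policy evaluation: nudge V(state) toward the sampled one-step target.
        target = cost + gamma * (theta @ features(state_next))
        theta += alpha * (target - theta @ features(state)) * features(state)
        state = state_next

print("learned value-function weights:", theta)
```

Greedy selection against the current approximation plays the role of policy improvement here, while the temporal-difference update toward the sampled target plays the role of policy evaluation.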

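A sample average approximation benchmark in the spirit of the two-stage comparison can be sketched by enumerating small first-stage reinforcement sets and averaging second-stage shortfall over a common scenario batch. This continues the first sketch (it reuses expected_unmet_demand, G, and probs); the reinforcement cost, penalty, and budget are illustrative assumptions.

```python
from itertools import combinations

REINFORCE_COST = 1.0   # assumed cost per reinforced edge
PENALTY = 10.0         # assumed penalty per unit of unmet demand

def saa_plan(G, source, sink, demand, fail_prob, budget, n_scenarios=500):
    """Enumerate reinforcement sets of up to `budget` edges and pick the
    first-stage plan minimizing the sample-average objective."""
    edges = list(G.edges)
    best_plan, best_obj = frozenset(), float("inf")
    for k in range(budget + 1):
        for plan in combinations(edges, k):
            # seed=1 fixes a common scenario batch across all candidate plans
            # (the usual common-random-numbers device in SAA comparisons).
            shortfall = expected_unmet_demand(G, source, sink, demand,
                                              fail_prob, set(plan),
                                              n_samples=n_scenarios, seed=1)
            obj = REINFORCE_COST * k + PENALTY * shortfall
            if obj < best_obj:
                best_plan, best_obj = frozenset(plan), obj
    return best_plan, best_obj

plan, obj = saa_plan(G, "s", "t", demand=8, fail_prob=probs, budget=2)
print("chosen edges:", sorted(plan), "objective:", round(obj, 2))
```
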
Bibliographic details

  • Author: Hammel, Erik
  • Author affiliation: Rensselaer Polytechnic Institute
  • Degree-granting institution: Rensselaer Polytechnic Institute
  • Subjects: Applied Mathematics; Artificial Intelligence; Operations Research
  • Degree: Ph.D.
  • Year: 2013
  • Pages: 147
  • Format: PDF
  • Language: English
