High-dimensional adaptive dynamic programming with mixed integer linear programming.

Abstract

Dynamic programming (DP; Bellman 1957) is a classic mathematical programming approach for solving multistage decision problems. The "Bellman equation" uses a recursion whose objective function includes both the current contribution and the future contribution of a decision. The method has the potential to represent dynamic decision-making systems, but exact DP solution algorithms are limited to small, restricted problems, such as problems with linear transitions or problems without uncertainty. Approximate dynamic programming (ADP) is a modern branch of DP that seeks numerical solutions via approximation. It can be applied to real-world DP problems, but high-dimensional problems remain challenging. This dissertation focuses on ADP value function approximation for a continuous state space from the statistical perspective (Chen et al. 1999). Two directions of ADP methodology are developed: a sequential algorithm to explore the state space, and a sequential algorithm using mixed integer linear programming (MILP) and regression trees.

The first component addresses exploration of the state space. A sequential state space exploration (SSSE) algorithm (Fan 2008) was previously developed using neural networks; here, the use of multivariate adaptive regression splines (MARS; Friedman 1991) in place of neural networks is considered. In ADP, the value function approximation is defined over a specified state space region, but in the real world the relevant region is unknown. In particular, the ADP approach employed in this dissertation uses a statistical perspective analogous to design and analysis of computer experiments (DACE; Chen et al. 2006). In DACE, an experimental design is used to discretize the input space region, which corresponds to the state space region in ADP. Since the ADP state space region is unknown, SSSE uses stochastic trajectories to sample future states and identify the potential range of the system state. By reducing the number of iterations without impacting solution quality, SSSE with MARS demonstrates improved efficiency over SSSE with neural networks.

The second component of this dissertation addresses the optimization of a real-world, complex, dynamic system. This work is motivated by a case study on the environmental impact of aircraft deicing activities at the Dallas-Fort Worth (D/FW) International Airport. For this case study, the state transitions are nonlinear, the objective function is nonconvex, and the decision (action) space is high-dimensional and discrete. To overcome these complexities, an ADP method is introduced using a piecewise linear value function approximation and MILP. The piecewise linear structure can be transformed into an MILP by introducing binary variables. The treed regression value function approximation is flexible enough to capture the nonlinear structure of the data, yet structured enough to be easily formulated in an MILP. The proposed ADP approach is compared with a reinforcement learning approach (Murphy 2005).
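For reference, the recursion behind the "Bellman equation" mentioned above can be stated in a generic finite-horizon form as follows; the notation is a standard textbook sketch, not necessarily the formulation used in the dissertation:

```latex
% Generic finite-horizon Bellman recursion (illustrative notation):
%   V_t   : value function at stage t
%   s_t   : state,  x_t : decision (action)
%   C_t   : current-stage contribution,  gamma : discount factor
% A maximization form is analogous.
\[
  V_t(s_t) \;=\; \min_{x_t \in \mathcal{X}(s_t)}
    \Big\{ C_t(s_t, x_t) \;+\; \gamma\,
      \mathbb{E}\big[\, V_{t+1}(s_{t+1}) \,\big|\, s_t, x_t \,\big] \Big\}
\]
```

ADP replaces the exact value function V_{t+1} with an approximation (here MARS, treed regression, or neural networks) fitted over a sampled region of the continuous state space.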
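As an illustration of the statement that a piecewise linear value function can be transformed for MILP by introducing binary variables, the sketch below gives the standard convex-combination ("lambda") encoding of a one-dimensional piecewise linear function; it is a generic construction for intuition, not the specific treed-regression formulation developed in the dissertation:

```latex
% Piecewise linear function through breakpoints (s^1,v^1), ..., (s^K,v^K),
% written with interpolation weights lambda_k and segment binaries y_k.
\begin{align*}
  s &= \sum_{k=1}^{K} \lambda_k\, s^k, &
  \hat{V}(s) &= \sum_{k=1}^{K} \lambda_k\, v^k,\\
  \sum_{k=1}^{K} \lambda_k &= 1, \quad \lambda_k \ge 0, &
  \sum_{k=1}^{K-1} y_k &= 1, \quad y_k \in \{0,1\},\\
  \lambda_1 &\le y_1, \quad \lambda_K \le y_{K-1}, &
  \lambda_k &\le y_{k-1} + y_k \quad (k = 2,\dots,K-1).
\end{align*}
% The binaries force the nonzero weights onto one pair of adjacent
% breakpoints, so \hat{V} is linear on exactly one segment even when
% the overall function is nonconvex.
```

Because every relation above is linear in the continuous and binary variables, such an approximate value function can sit inside the MILP that selects the current decision, which is what makes the piecewise linear (treed regression) structure attractive for the nonconvex, discrete-action deicing case study.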

Bibliographic record

  • Author: Zhang, Zirun
  • Author affiliation: The University of Texas at Arlington
  • Degree-granting institution: The University of Texas at Arlington
  • Subject: Operations research; Industrial engineering
  • Degree: Ph.D.
  • Year: 2015
  • Pages: 129 p.
  • Total pages: 129
  • Format: PDF
  • Language: English
  • CLC classification:
  • Keywords:
