Intelligent Control of a Sensor-Actuator System via Kernelized Least-Squares Policy Iteration

Abstract

In this paper, a new framework called Compressive Kernelized Reinforcement Learning (CKRL) is proposed for computing near-optimal policies in sequential decision making under uncertainty. It combines non-adaptive, data-independent Random Projections with nonparametric Kernelized Least-Squares Policy Iteration (KLSPI). Random Projections are a fast, non-adaptive dimensionality reduction framework in which high-dimensional data is projected onto a random lower-dimensional subspace via spherically random rotation and coordinate sampling. KLSPI introduces the kernel trick into the LSPI framework for Reinforcement Learning, often achieving faster convergence and providing automatic feature selection via various kernel sparsification approaches. In this approach, policies are computed in a low-dimensional subspace generated by projecting the high-dimensional features onto a set of random basis vectors. We first show how Random Projections constitute an efficient sparsification technique and how our method often converges faster than regular LSPI, at lower computational cost. The theoretical foundation underlying this approach is a fast approximation of Singular Value Decomposition (SVD). Finally, simulation results on benchmark MDP domains are presented, which confirm gains in both computation time and performance in large feature spaces.
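
The idea of computing a policy's value function in a randomly projected feature space can be illustrated with a minimal sketch. It is not the authors' CKRL implementation: it assumes a plain Gaussian random projection (rather than the random rotation and coordinate sampling scheme the abstract mentions), linear features instead of kernels, and synthetic data. The names `random_projection_matrix` and `lstdq`, and all sizes and constants, are hypothetical choices made for illustration.

```python
import numpy as np

def random_projection_matrix(D, d, rng):
    """Gaussian d x D matrix, scaled so squared norms are preserved in expectation.
    Assumption: a simple Gaussian projection, not the paper's specific scheme."""
    return rng.standard_normal((d, D)) / np.sqrt(d)

def lstdq(phi, phi_next, rewards, gamma=0.9, ridge=1e-6):
    """One LSTD-Q solve: A w = b with A = Phi^T (Phi - gamma * Phi'), b = Phi^T r.
    A small ridge term is added to keep A well conditioned."""
    k = phi.shape[1]
    A = phi.T @ (phi - gamma * phi_next) + ridge * np.eye(k)
    b = phi.T @ rewards
    return np.linalg.solve(A, b)

rng = np.random.default_rng(0)
n, D, d = 500, 2000, 50                  # samples, original dim, projected dim

# Synthetic stand-ins for features of (s, a) and (s', pi(s')) and for rewards.
phi = rng.standard_normal((n, D))
phi_next = rng.standard_normal((n, D))
rewards = rng.standard_normal(n)

# Project the features once, then solve in the low-dimensional space.
P = random_projection_matrix(D, d, rng)
z, z_next = phi @ P.T, phi_next @ P.T
w = lstdq(z, z_next, rewards)            # d weights instead of D

# Ratio of squared norms after and before projection, per sample.
norm_ratio = np.sum(z**2, axis=1) / np.sum(phi**2, axis=1)
print(w.shape, float(norm_ratio.mean()))
```

In a full policy-iteration loop, `phi_next` would be recomputed from the current greedy policy at each iteration while the projection matrix `P` stays fixed, so the linear system solved per iteration has size d rather than D.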

Bibliographic details

Journal: Sensors (Basel, Switzerland)
Authors: Bo Liu; Sanfeng Chen; Shuai Li; Yongsheng Liang
Year (volume), issue: 2012 (12), 3
Pages: 2632–2653
Total pages: 22
Original format: PDF
Keywords: Markov Decision Process; sensor-actuator systems; Random Projections; Kernelized Least-Squares Policy Iteration

Similar literature

1. Intelligent Control of a Sensor-Actuator System via Kernelized Least-Squares Policy Iteration [J]. Bo Liu, Sanfeng Chen, Shuai Li, Sensors. 2012, No. 3
2. Potential-Based Least-Squares Policy Iteration for a Parameterized Feedback Control System [J]. Cheng Kang, Zhang Kanjian, Fei Shumin, Journal of Optimization Theory and Applications. 2016, No. 2
3. Control of a Realistic Wave Energy Converter Model Using Least-Squares Policy Iteration [J]. Enrico Anderlini, David I. M. Forehand, Elva Bannon, IEEE Transactions on Sustainable Energy. 2017, No. 4
4. Adaptive Kernel-Width Selection for Kernel-Based Least-Squares Policy Iteration Algorithm [C]. Jun Wu, Xin Xu, Lei Zuo, ISNN 2011; International Symposium on Neural Networks. 2011
5. First-Order Systems Least-Squares Finite Element Methods and Nested Iteration for Electromagnetic Two-Fluid Kinetic-Based Plasma Models [D]. Leibs, Christopher A. 2014
6. Gait Phase Classification and Assist Torque Prediction for a Lower Limb Exoskeleton System Using Kernel Recursive Least-Squares Method [O]. Yue Ma, Xinyu Wu, Can Wang, 2019
7. Intelligent Control of a Sensor-Actuator System via Kernelized Least-Squares Policy Iteration [O]. Liu, Bo, Chen, Sanfeng, Li, Shuai, 2012
8. IVHS and Environmental Impacts: Implications of the Operational Tests. National Policy Conference on Intelligent Operational Tests. National Policy Conference on Intelligent Transportation Systems and the Environment. Held in Arlington, Virginia on June 6- [R]. Little, C., Wooster, J. 1994
