JMLR: Workshop and Conference Proceedings

An Online Algorithm for Smoothed Regression and LQR Control



Abstract

We consider Online Convex Optimization (OCO) in the setting where the costs are $m$-strongly convex and the online learner pays a switching cost for changing decisions between rounds. We show that the recently proposed Online Balanced Descent (OBD) algorithm is constant competitive in this setting, with competitive ratio $3 + O(1/m)$, irrespective of the ambient dimension. Additionally, we show that when the sequence of cost functions is $\epsilon$-smooth, OBD has near-optimal dynamic regret and maintains strong per-round accuracy. We demonstrate the generality of our approach by showing that the OBD framework can be used to construct competitive algorithms for a variety of online problems across learning and control, including online variants of ridge regression, logistic regression, maximum likelihood estimation, and LQR control.
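To make the setting concrete, the following is a minimal sketch of a balanced-descent-style update for the special case of quadratic costs $f_t(x) = \frac{m}{2}\|x - v_t\|^2$ with a squared-$\ell_2$ switching cost. The balance parameter `beta`, the bisection over sublevel sets, and the quadratic specialization are illustrative assumptions on my part, not the paper's exact algorithm or analysis.

```python
import math

def obd_quadratic_step(x_prev, v, m, beta, tol=1e-9):
    """One balanced-descent-style step for the quadratic cost
    f(x) = (m/2) * ||x - v||^2 with switching cost (1/2)||x - x_prev||^2.

    Bisect on the level l of the sublevel set {x : f(x) <= l}: project
    x_prev onto that set and stop when the switching cost paid roughly
    balances beta * l (the hitting cost incurred at the chosen point).
    """
    d = math.dist(x_prev, v)           # distance from previous point to minimizer
    lo, hi = 0.0, 0.5 * m * d * d      # l = f(x_prev) corresponds to "don't move"
    for _ in range(200):
        l = 0.5 * (lo + hi)
        r = math.sqrt(2.0 * l / m)     # radius of the sublevel ball around v
        move = 0.5 * (d - r) ** 2      # switching cost after projecting onto it
        if move > beta * l:
            lo = l                     # moved too far: allow a larger level
        else:
            hi = l
        if hi - lo < tol:
            break
    if d == 0.0:
        return list(v)                 # already at the minimizer: stay put
    r = math.sqrt(2.0 * hi / m)
    t = r / d                          # interpolate from v back toward x_prev
    return [vi + t * (xi - vi) for xi, vi in zip(x_prev, v)]
```

For example, with `x_prev = [0, 0]`, `v = [1, 0]`, `m = 2`, and `beta = 1`, the balance condition reduces to a scalar quadratic whose root places the new point strictly between the old decision and the round's minimizer, which is the qualitative behavior the competitive-ratio analysis relies on.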
