State-aggregation algorithms for learning probabilistic models for robot control.

Abstract

This thesis addresses the problem of learning probabilistic representations of dynamical systems with non-linear dynamics and hidden state in the form of partially observable Markov decision process (POMDP) models, with the explicit purpose of using these models for robot control. In contrast to the usual approach to learning probabilistic models, which iteratively adjusts probabilities so as to improve the likelihood of the observed data, the algorithms proposed in this thesis take a different approach: they reduce the learning problem to one of state aggregation, clustering points in an embedding space of delayed coordinates and subsequently estimating transition probabilities between the aggregated states (clusters). This approach has close ties to the dominant methods for system identification in the field of control engineering, although the characteristics of POMDP models require very different algorithmic solutions.

In addition to an extensive investigation of their performance in simulation, the proposed algorithms are applied to two robots built in the course of our experiments. The first is a differential-drive mobile robot with a minimal number of proximity sensors, which has to perform the well-known robotic task of self-localization along the perimeter of its workspace. In comparison with previous neural-network-based approaches to the same problem, our algorithm achieved much higher spatial accuracy of localization. The second task is visual servo control of an under-actuated arm that has to rotate an attached flying ball so as to maintain maximal height of rotation with minimal energy expenditure. Even though this problem is intractable for known control-engineering methods because of its strongly non-linear dynamics and partially observable state, a control policy obtained by policy iteration on a POMDP model learned by our state-aggregation algorithm outperformed several alternative open-loop and closed-loop controllers.
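The pipeline the abstract describes lends itself to a compact illustration: embed the observation sequence in delayed coordinates, aggregate the embedded points by clustering, and estimate per-action transition probabilities between the resulting discrete states. The sketch below is a minimal, hypothetical rendering of that pipeline, not the thesis's actual algorithm: it assumes a scalar or vector observation sequence and integer-coded actions, and it uses k-means as the clustering step (the abstract does not name a specific clustering method). The names `delay_embed` and `learn_aggregated_model` are illustrative only.

```python
import numpy as np
from sklearn.cluster import KMeans

def delay_embed(obs, k):
    """Stack k consecutive observations into delayed-coordinate vectors."""
    obs = np.asarray(obs)
    return np.array([obs[t:t + k].ravel() for t in range(len(obs) - k + 1)])

def learn_aggregated_model(obs, actions, k=3, n_states=8, seed=0):
    """Cluster delay-embedded observations into aggregated states and
    estimate per-action transition probabilities between them."""
    X = delay_embed(obs, k)
    states = KMeans(n_clusters=n_states, n_init=10, random_state=seed).fit_predict(X)

    n_actions = int(np.max(actions)) + 1
    counts = np.zeros((n_actions, n_states, n_states))
    for t in range(len(states) - 1):
        # One possible convention (an assumption here): the transition from
        # window t to window t+1 is driven by the action taken after the
        # last observation in window t.
        a = int(actions[t + k - 1])
        counts[a, states[t], states[t + 1]] += 1

    # Row-normalize counts into transition matrices, with light smoothing
    # so unvisited (action, state) pairs still define a distribution.
    P = (counts + 1e-3) / (counts + 1e-3).sum(axis=2, keepdims=True)
    return states, P
```

Given the learned transition matrices, a control policy could then be computed with a standard solution method; the abstract's control results use policy iteration on the learned POMDP model for that final step.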