Home > Foreign Conference Papers > IEEE/RSJ International Conference on Intelligent Robots and Systems > Learning Visuomotor Policies for Aerial Navigation Using Cross-Modal Representations

Learning Visuomotor Policies for Aerial Navigation Using Cross-Modal Representations
Abstract

Machines are a long way from robustly solving open-world perception-control tasks, such as first-person view (FPV) aerial navigation. While recent advances in end-to-end Machine Learning, especially Imitation Learning and Reinforcement Learning, appear promising, they are constrained by the need for large amounts of difficult-to-collect labeled real-world data. Simulated data, on the other hand, is easy to generate, but generally does not render safe behaviors in diverse real-life scenarios. In this work we propose a novel method for learning robust visuomotor policies for real-world deployment which can be trained purely with simulated data. We develop rich state representations that combine supervised and unsupervised environment data. Our approach takes a cross-modal perspective, where separate modalities correspond to the raw camera data and the system states relevant to the task, such as the relative pose of gates to the drone in the case of drone racing. We feed both data modalities into a novel factored architecture, which learns a joint low-dimensional embedding via Variational Auto-Encoders. This compact representation is then fed into a control policy, which we train using imitation learning with expert trajectories in a simulator. We analyze the rich latent spaces learned with our proposed representations, and show that the use of our cross-modal architecture significantly improves control policy performance as compared to end-to-end learning or purely unsupervised feature extractors. We also present real-world results for drone navigation through gates in different track configurations and environmental conditions. Our proposed method, which runs fully onboard, can successfully generalize the learned representations and policies across simulation and reality, significantly outperforming baseline approaches.
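The factored architecture described in the abstract can be illustrated with a minimal sketch: two modality-specific encoders (raw camera features and task-relevant state such as relative gate pose) map into the same low-dimensional latent space via the VAE reparameterization trick, and a policy head maps the latent to a control command. All dimensions, the gate-pose parameterization, and the random weights below are hypothetical placeholders, not the paper's actual network; only the camera modality is assumed available at deployment.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper): 64-d image features,
# 4-d gate-pose state, 10-d shared latent, 4-d velocity command.
IMG_DIM, STATE_DIM, LATENT_DIM, ACT_DIM = 64, 4, 10, 4

def linear(in_dim, out_dim):
    """Random linear-layer parameters, a stand-in for trained weights."""
    return rng.normal(scale=0.1, size=(in_dim, out_dim)), np.zeros(out_dim)

# Separate encoder per modality, each mapping into (mu, logvar) of the
# SAME shared latent space -- the joint embedding of the abstract.
W_img, b_img = linear(IMG_DIM, 2 * LATENT_DIM)
W_state, b_state = linear(STATE_DIM, 2 * LATENT_DIM)
# Policy head: latent -> bounded velocity command (e.g. vx, vy, vz, yaw rate).
W_pi, b_pi = linear(LATENT_DIM, ACT_DIM)

def encode(x, W, b):
    h = x @ W + b
    return h[:LATENT_DIM], h[LATENT_DIM:]  # mu, logvar

def reparameterize(mu, logvar):
    # Standard VAE reparameterization: z = mu + sigma * eps.
    return mu + np.exp(0.5 * logvar) * rng.normal(size=mu.shape)

def policy(z):
    return np.tanh(z @ W_pi + b_pi)

# Deployment path: only the camera modality is observed.
img_features = rng.normal(size=IMG_DIM)  # placeholder for a CNN's output
mu_i, logvar_i = encode(img_features, W_img, b_img)
action = policy(reparameterize(mu_i, logvar_i))

# Training path: the simulator also supplies the true gate pose, so the
# state encoder can be trained to land in the same latent region
# (cross-modal consistency); here we only show the matching shapes.
gate_pose = np.array([3.0, 0.2, -0.1, 0.05])  # hypothetical relative pose
mu_s, logvar_s = encode(gate_pose, W_state, b_state)
```

With random weights the action is meaningless; the point is the data flow: either modality yields a latent of the same shape, and only the compact latent, never the raw image, reaches the policy.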
