Robotics and Autonomous Systems

Multimodal integration learning of robot behavior using deep neural networks


Abstract

For humans to understand the world around them accurately, multimodal integration is essential because it enhances perceptual precision and reduces ambiguity. Computational models that replicate this ability may contribute to the practical use of robots in everyday human environments; however, primarily because of the scalability problems of conventional machine learning algorithms, sensory-motor information processing in robotic applications has typically been handled by modality-dependent processes. In this paper, we propose a novel computational framework that enables the integration of sensory-motor time-series data and the self-organization of multimodal fused representations based on a deep learning approach. To evaluate the proposed model, we conducted two behavior-learning experiments with a humanoid robot: an object-manipulation task and a bell-ringing task. Our experimental results show that large amounts of sensory-motor information, including raw RGB images, sound spectra, and joint angles, are directly fused to generate higher-level multimodal representations. Furthermore, we demonstrate that the proposed framework realizes three functions: (1) cross-modal memory retrieval, exploiting the information-complementation capability of the deep autoencoder; (2) noise-robust behavior recognition, exploiting the generalization capability of multimodal features; and (3) acquisition of multimodal causality and sensory-motor prediction based on the acquired causality.
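The core idea described in the abstract is a deep autoencoder that fuses concatenated sensory-motor inputs into a shared latent code, with cross-modal retrieval amounting to reconstructing a missing modality from a partial input. Below is a minimal sketch of that idea, assuming PyTorch; the modality dimensions, layer sizes, and variable names are illustrative placeholders, not the configuration used in the paper.

```python
# Minimal sketch of a multimodal deep autoencoder (assumed PyTorch).
# Modality dimensions and layer sizes are hypothetical placeholders.
import torch
import torch.nn as nn

IMG_DIM, SND_DIM, JNT_DIM = 1024, 257, 10   # hypothetical per-frame sizes
INPUT_DIM = IMG_DIM + SND_DIM + JNT_DIM

class MultimodalAutoencoder(nn.Module):
    def __init__(self, latent_dim=100):
        super().__init__()
        # Encoder: fuse the concatenated modalities into a shared code.
        self.encoder = nn.Sequential(
            nn.Linear(INPUT_DIM, 500), nn.Sigmoid(),
            nn.Linear(500, latent_dim), nn.Sigmoid(),
        )
        # Decoder: reconstruct all modalities from the shared code.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 500), nn.Sigmoid(),
            nn.Linear(500, INPUT_DIM),
        )

    def forward(self, image, sound, joints):
        x = torch.cat([image, sound, joints], dim=-1)
        z = self.encoder(x)          # fused multimodal feature
        recon = self.decoder(x if False else z)  # reconstruct from latent code
        return recon, z

# Cross-modal retrieval: feed a partial input (here, sound zeroed out)
# and read the missing modality back from the reconstruction.
model = MultimodalAutoencoder()
image = torch.rand(1, IMG_DIM)
joints = torch.rand(1, JNT_DIM)
missing_sound = torch.zeros(1, SND_DIM)
recon, fused = model(image, missing_sound, joints)
retrieved_sound = recon[:, IMG_DIM:IMG_DIM + SND_DIM]
```

In this reading, the fused code `fused` would serve as the multimodal feature for behavior recognition, while reconstruction from partial inputs illustrates the information-complementation capability the abstract attributes to the deep autoencoder.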
