IEEE Robotics and Automation Letters

Learning Object-Action Relations from Bimanual Human Demonstration Using Graph Networks



Abstract

Recognizing human actions is a vital task for a humanoid robot, especially in domains like programming by demonstration. Previous approaches to action recognition have primarily focused on the overall prevalent action being executed, but we argue that bimanual human motion cannot always be described sufficiently with a single action label. We present a system for frame-wise action classification and segmentation in bimanual human demonstrations. The system extracts symbolic spatial object relations from raw RGB-D video data captured from the robot's point of view in order to build graph-based scene representations. To learn object-action relations, a graph network classifier is trained on these representations together with ground-truth action labels to predict the action executed by each hand. We evaluated the proposed classifier on a new RGB-D video dataset of daily action sequences focusing on bimanual manipulation. It consists of 6 subjects performing 9 tasks with 10 repetitions each, yielding 540 video recordings with a total playtime of 2 hours and 18 minutes and per-hand ground-truth action labels for every frame. We show that the classifier reliably identifies the true executed action of each hand within its top 3 predictions on a frame-by-frame basis (action classification macro $F_1$-score of 0.86) without prior temporal action segmentation.
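To make the described pipeline concrete, the sketch below illustrates one possible reading of it: a symbolic scene graph per frame (nodes for the two hands and the objects, edges for spatial relations) fed through one round of message passing to produce a per-hand action prediction. This is not the authors' implementation; the node/relation/action vocabularies, the one-hot feature encodings, and the single GraphNet-style layer are illustrative assumptions, written here in PyTorch.

```python
# Minimal sketch of a frame-wise scene graph and a per-hand graph network classifier.
# Vocabularies, feature sizes and the message-passing layer are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

NODE_TYPES = ["left_hand", "right_hand", "cup", "bottle", "cutting_board"]   # hypothetical
RELATIONS = ["contact", "above", "inside", "moving_together"]                # hypothetical
ACTIONS = ["idle", "reach", "grasp", "pour", "cut", "place"]                 # hypothetical

def one_hot(index, size):
    v = torch.zeros(size)
    v[index] = 1.0
    return v

class FrameGraph:
    """Graph-based scene representation of a single frame: nodes are the two hands and
    the detected objects, directed edges carry a symbolic spatial relation."""
    def __init__(self, node_types, edges):
        # edges: list of (src_index, dst_index, relation_name)
        self.x = torch.stack([one_hot(NODE_TYPES.index(t), len(NODE_TYPES)) for t in node_types])
        self.edge_index = torch.tensor([[s for s, _, _ in edges],
                                        [d for _, d, _ in edges]], dtype=torch.long)
        self.edge_attr = torch.stack([one_hot(RELATIONS.index(r), len(RELATIONS)) for _, _, r in edges])

class GraphNetClassifier(nn.Module):
    """One round of edge-to-node message passing followed by a per-node action head."""
    def __init__(self, node_dim, edge_dim, hidden, n_actions):
        super().__init__()
        self.edge_mlp = nn.Sequential(nn.Linear(2 * node_dim + edge_dim, hidden), nn.ReLU())
        self.node_mlp = nn.Sequential(nn.Linear(node_dim + hidden, hidden), nn.ReLU())
        self.action_head = nn.Linear(hidden, n_actions)

    def forward(self, g):
        src, dst = g.edge_index
        # Message per edge from both endpoint features and the symbolic relation.
        msgs = self.edge_mlp(torch.cat([g.x[src], g.x[dst], g.edge_attr], dim=-1))
        # Sum-aggregate incoming messages per node.
        agg = torch.zeros(g.x.size(0), msgs.size(-1))
        agg.index_add_(0, dst, msgs)
        h = self.node_mlp(torch.cat([g.x, agg], dim=-1))
        return self.action_head(h)  # one action logit vector per node

# Example frame: left hand touching the cup, cup above the cutting board.
frame = FrameGraph(node_types=["left_hand", "right_hand", "cup", "cutting_board"],
                   edges=[(0, 2, "contact"), (2, 3, "above")])
model = GraphNetClassifier(len(NODE_TYPES), len(RELATIONS), hidden=32, n_actions=len(ACTIONS))
logits = model(frame)
# Read predictions off the two hand nodes; top-3 actions per hand, matching the
# frame-wise top-3 evaluation described in the abstract.
for hand_idx, name in [(0, "left_hand"), (1, "right_hand")]:
    top3 = torch.topk(F.softmax(logits[hand_idx], dim=-1), k=3).indices.tolist()
    print(name, [ACTIONS[i] for i in top3])
```

In this reading, per-hand classification falls out naturally from a node-level prediction head: the same network scores every node, and only the hand nodes are evaluated against the per-hand ground-truth labels.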

