Conference: International Workshop on Spoken Dialogue Systems

Caption Generation of Robot Behaviors Based on Unsupervised Learning of Action Segments

Abstract

Bridging robot action sequences and their natural language captions is an important task for increasing the explainability of human-assisting robots, a recently evolving field. In this paper, we propose a system for generating natural language captions that describe the behaviors of human-assisting robots. The system describes robot actions from robot observations, namely histories from actuator systems and cameras, as a step toward end-to-end bridging between robot actions and natural language captions. Two reasons make it challenging to apply existing sequence-to-sequence models to this mapping: (1) it is hard to prepare a large-scale dataset for every kind of robot and environment, and (2) there is a gap between the number of samples obtained from robot action observations and the number of words in the generated captions. We introduce unsupervised segmentation based on K-means clustering so that similar robot observation patterns are mapped to the same class. This makes it possible for the network to learn the relationship from a small amount of data. Moreover, we use a chunking method based on byte-pair encoding (BPE) to close the gap between the number of robot action observation samples and the number of words in a caption. We also applied an attention mechanism to the segmentation task. Experimental results show that the proposed model based on unsupervised learning generates better descriptions than other methods. We also show that the attention mechanism did not work well in our low-resource setting.
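
The two preprocessing ideas named in the abstract, K-means segmentation and BPE-based chunking, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy 2-D observation frames, the choice of two clusters, the number of merges, and the function names `kmeans_labels` and `bpe_chunk`. The abstract does not specify the actual features, cluster count, or network.

```python
from collections import Counter


def kmeans_labels(frames, k, iters=20):
    """Plain Lloyd's k-means; returns one cluster id per observation frame."""
    # Deterministic init for this sketch: k frames spread evenly over the sequence.
    step = max(1, len(frames) // k)
    centers = [list(frames[min(i * step, len(frames) - 1)]) for i in range(k)]
    labels = [0] * len(frames)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(f, centers[c])))
            for f in frames
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [f for f, lab in zip(frames, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def bpe_chunk(symbols, num_merges):
    """BPE-style chunking: repeatedly fuse the most frequent adjacent pair."""
    seq = [(s,) for s in symbols]  # each token is a tuple of cluster ids
    for _ in range(num_merges):
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        (a, b), count = pairs.most_common(1)[0]
        if count < 2:
            break  # no pair repeats, so merging would not shorten the sequence further
        merged, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and seq[i] == a and seq[i + 1] == b:
                merged.append(a + b)  # fuse the pair into one longer chunk
                i += 2
            else:
                merged.append(seq[i])
                i += 1
        seq = merged
    return seq


# Hypothetical observation history: two alternating "poses" in a 2-D feature space.
frames = [(0.0, 0.1), (0.1, 0.0), (5.0, 5.1), (5.1, 5.0),
          (0.0, 0.0), (0.1, 0.1), (5.0, 5.0), (5.1, 5.1)]

labels = kmeans_labels(frames, k=2)      # frame-level class ids
chunks = bpe_chunk(labels, num_merges=3)  # shorter sequence of action chunks
print(len(labels), "frames ->", len(chunks), "chunks:", chunks)
```

In this toy setting, clustering replaces continuous observations with a small discrete vocabulary, and the merges shorten the class sequence so that its length is closer to that of a caption. A sequence-to-sequence model would then consume the chunk sequence rather than the raw frames.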