IEEE International Conference on Robotics and Automation

Learning individual motion preferences from audience feedback of motion sequences

Abstract

A robot performs a sequence of motions to animate a given input, e.g., dancing to music or telling a story. Each input is pre-processed to determine labels, e.g., emotions of the music or words in the story. Each label corresponds to multiple motions, and each motion has multiple labels. Therefore, the robot can choose one sequence from multiple motion sequences to animate the input. We aim to choose the best sequence to animate based on the audience's preferences. The audience prefers some motions over others, and each motion has an initially unknown preference value. At the end of the motion sequence, the audience provides feedback which is the sum of the motions' preference values. However, the observation of the feedback is noisy due to the device used to capture the audience's feedback. To select the most preferred sequence, the robot has to determine the sequence to query the audience with, so as to learn the preference values of individual motions from noisy observations of the audience's feedback. By learning the individual motion preference values, the most preferred sequence can be determined. Moreover, the audience may get bored of watching the same single motion in multiple sequences and the preference value will degrade based on the number of times the motion is viewed. We contribute MAK (Multi-Armed bandit and Kalman filter) and show that MAK outperforms least squares regression in selecting the best sequence with lower degradation in our simulation experiments.
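The observation model described in the abstract (feedback equals the sum of the shown motions' preference values, plus sensor noise) can be illustrated with a Kalman measurement update over a static preference vector. The following is a minimal sketch under stated assumptions, not the authors' MAK algorithm: the preference values, the noise level, the candidate sequences, and the function names `kalman_update` and `run_demo` are all hypothetical, sequences are queried round-robin instead of by a multi-armed bandit policy, and the boredom-driven degradation of preference values is not modeled.

```python
import random


def kalman_update(mean, cov, counts, observed_sum, noise_var):
    """One Kalman measurement update for a static preference vector.

    counts[i] is how many times motion i appears in the shown sequence,
    so the predicted feedback is sum(counts[i] * mean[i]).
    """
    n = len(mean)
    # cov_h = P h (equal to the transpose of h^T P because P is symmetric)
    cov_h = [sum(cov[i][j] * counts[j] for j in range(n)) for i in range(n)]
    # Innovation variance: h^T P h + r
    innovation_var = sum(counts[i] * cov_h[i] for i in range(n)) + noise_var
    gain = [cov_h[i] / innovation_var for i in range(n)]
    # Difference between observed and predicted feedback
    residual = observed_sum - sum(counts[i] * mean[i] for i in range(n))
    new_mean = [mean[i] + gain[i] * residual for i in range(n)]
    new_cov = [[cov[i][j] - gain[i] * cov_h[j] for j in range(n)]
               for i in range(n)]
    return new_mean, new_cov


def run_demo(seed=0, rounds=300):
    """Estimate three hypothetical preference values from noisy sums."""
    rng = random.Random(seed)
    true_prefs = [3.0, 1.0, 2.0]          # hypothetical, unknown to the robot
    noise_std = 0.5                       # hypothetical sensor noise
    sequences = [[0, 1], [1, 2], [0, 2]]  # hypothetical candidate sequences
    n = len(true_prefs)
    mean = [0.0] * n                      # uninformed prior mean
    cov = [[100.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for t in range(rounds):
        # Round-robin querying stands in for a bandit selection policy.
        seq = sequences[t % len(sequences)]
        counts = [seq.count(i) for i in range(n)]
        feedback = sum(true_prefs[i] for i in seq) + rng.gauss(0.0, noise_std)
        mean, cov = kalman_update(mean, cov, counts, feedback, noise_std ** 2)
    # Pick the sequence with the highest estimated total preference.
    best = max(sequences, key=lambda s: sum(mean[i] for i in s))
    return mean, best


if __name__ == "__main__":
    estimates, best_sequence = run_demo()
    print("estimated preferences:", [round(v, 2) for v in estimates])
    print("best sequence:", best_sequence)
```

Because each query reveals only a sum, individual values become identifiable only when the queried sequences overlap in different combinations; the three pairs used here are linearly independent, which is why the per-motion estimates can separate.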
