
Comparing Cascaded LSTM Architectures for Generating Head Motion from Speech in Task-Oriented Dialogs


Abstract

To generate action events for a humanoid robot in human-robot interaction (HRI), multimodal interactive behavioral models are typically conditioned on the observed actions of the human partner(s). In previous research, we built an interactive model that generates discrete events for gaze and arm gestures, which can be used to drive our iCub humanoid robot [19,20]. In this paper, we investigate how to generate continuous head motion in a collaborative scenario where head motion contributes to verbal as well as nonverbal functions. We show that in this scenario the fundamental frequency of speech (the F0 feature) is not sufficient to drive head motion, while gaze contributes significantly to head motion generation. We propose a cascaded Long Short-Term Memory (LSTM) model that first estimates gaze from the speech content and the hand gestures performed by the partner. This estimate is then used as an additional input for head motion generation. The results show that the proposed method outperforms a single-task model with the same inputs.
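The following is a minimal sketch of the two-stage cascade described in the abstract, assuming a PyTorch implementation. The feature dimensions, hidden sizes, and the parameterization of head motion as three rotation angles are illustrative assumptions, not values taken from the paper.

```python
# Hypothetical sketch of a cascaded LSTM: stage 1 estimates gaze from
# speech and partner hand-gesture features; stage 2 generates continuous
# head motion from speech plus the estimated gaze. All dimensions are
# assumptions for illustration only.
import torch
import torch.nn as nn

class CascadedHeadMotionModel(nn.Module):
    def __init__(self, speech_dim=26, gesture_dim=10, gaze_dim=3,
                 head_dim=3, hidden=128):
        super().__init__()
        # Stage 1: estimate gaze from speech content and the partner's
        # hand gestures.
        self.gaze_lstm = nn.LSTM(speech_dim + gesture_dim, hidden,
                                 batch_first=True)
        self.gaze_out = nn.Linear(hidden, gaze_dim)
        # Stage 2: generate continuous head motion (e.g., pitch/yaw/roll)
        # from speech plus the estimated gaze.
        self.head_lstm = nn.LSTM(speech_dim + gaze_dim, hidden,
                                 batch_first=True)
        self.head_out = nn.Linear(hidden, head_dim)

    def forward(self, speech, gestures):
        # speech:   (batch, time, speech_dim)
        # gestures: (batch, time, gesture_dim)
        h1, _ = self.gaze_lstm(torch.cat([speech, gestures], dim=-1))
        gaze = self.gaze_out(h1)            # intermediate gaze estimate
        h2, _ = self.head_lstm(torch.cat([speech, gaze], dim=-1))
        head_motion = self.head_out(h2)     # (batch, time, head_dim)
        return head_motion, gaze

# Usage with random data, purely to show the tensor shapes:
model = CascadedHeadMotionModel()
speech = torch.randn(4, 100, 26)    # e.g., F0 plus spectral features per frame
gestures = torch.randn(4, 100, 10)  # partner hand-gesture features
head, gaze = model(speech, gestures)
print(head.shape, gaze.shape)       # torch.Size([4, 100, 3]) torch.Size([4, 100, 3])
```

Exposing the gaze estimate as an intermediate target is what distinguishes this cascade from the single-task baseline with the same inputs that the abstract compares against.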
