
Comparing Cascaded LSTM Architectures for Generating Head Motion from Speech in Task-Oriented Dialogs


Abstract

To generate action events for a humanoid robot in human-robot interaction (HRI), multimodal interactive behavioral models are typically conditioned on the observed actions of the human partner(s). In previous research, we built an interactive model that generates discrete events for gaze and arm gestures, which can be used to drive our iCub humanoid robot [19,20]. In this paper, we investigate how to generate continuous head motion in a collaborative scenario where head motion contributes to verbal as well as nonverbal functions. We show that in this scenario the fundamental frequency of speech (the F0 feature) is not sufficient to drive head motion, while gaze contributes significantly to head motion generation. We propose a cascaded Long Short-Term Memory (LSTM) model that first estimates gaze from the speech content and the hand gestures performed by the partner. This estimate is then used as an additional input for head motion generation. The results show that the proposed method outperforms a single-task model with the same inputs.
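The following is a minimal sketch of the two-stage cascade described in the abstract, assuming a PyTorch implementation. The feature dimensions, hidden sizes, and the parameterization of head motion as three rotation angles are illustrative assumptions, not values taken from the paper.

```python
# Hypothetical sketch of a cascaded LSTM: stage 1 estimates gaze from
# speech and partner hand-gesture features; stage 2 generates continuous
# head motion from speech plus the estimated gaze. All dimensions are
# assumptions for illustration only.
import torch
import torch.nn as nn

class CascadedHeadMotionModel(nn.Module):
    def __init__(self, speech_dim=26, gesture_dim=10, gaze_dim=3,
                 head_dim=3, hidden=128):
        super().__init__()
        # Stage 1: estimate gaze from speech content and the partner's
        # hand gestures.
        self.gaze_lstm = nn.LSTM(speech_dim + gesture_dim, hidden,
                                 batch_first=True)
        self.gaze_out = nn.Linear(hidden, gaze_dim)
        # Stage 2: generate continuous head motion (e.g., pitch/yaw/roll)
        # from speech plus the estimated gaze.
        self.head_lstm = nn.LSTM(speech_dim + gaze_dim, hidden,
                                 batch_first=True)
        self.head_out = nn.Linear(hidden, head_dim)

    def forward(self, speech, gestures):
        # speech:   (batch, time, speech_dim)
        # gestures: (batch, time, gesture_dim)
        h1, _ = self.gaze_lstm(torch.cat([speech, gestures], dim=-1))
        gaze = self.gaze_out(h1)            # intermediate gaze estimate
        h2, _ = self.head_lstm(torch.cat([speech, gaze], dim=-1))
        head_motion = self.head_out(h2)     # (batch, time, head_dim)
        return head_motion, gaze

# Usage with random data, purely to show the tensor shapes:
model = CascadedHeadMotionModel()
speech = torch.randn(4, 100, 26)    # e.g., F0 plus spectral features per frame
gestures = torch.randn(4, 100, 10)  # partner hand-gesture features
head, gaze = model(speech, gestures)
print(head.shape, gaze.shape)       # torch.Size([4, 100, 3]) torch.Size([4, 100, 3])
```

Exposing the gaze estimate as an intermediate target is what distinguishes this cascade from the single-task baseline with the same inputs that the abstract compares against.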
