Asia-Pacific Signal and Information Processing Association Annual Summit and Conference

Multimodal Speech Driven Facial Shape Animation Using Deep Neural Networks



Abstract

In this paper, we present a multimodal deep learning approach for speech-driven generation of face animations. Training a speaker-independent model capable of rendering the speaker's different emotions is crucial for realistic animations. Unlike previous approaches, which use either acoustic features or phoneme labels to estimate facial movements, we utilize both modalities to generate natural-looking, speaker-independent lip animations synchronized with affective speech. The phoneme-based model enables generation of speaker-independent animations, while the acoustic-feature-based model captures affective variation during animation generation. We show that our multimodal approach not only performs significantly better on affective data but also improves performance on neutral data. We evaluate the proposed multimodal speech-driven animation model on two large-scale datasets, GRID and SAVEE, reporting the mean squared error (MSE) over various network structures.
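The abstract specifies only the high-level design (two input modalities fused into one regression network, trained with an MSE objective), not the architecture itself. Below is a minimal PyTorch sketch of such a multimodal model. The LSTM trunk, the class name MultimodalLipModel, and all layer and feature dimensions are illustrative assumptions, not the authors' implementation; only the two input modalities (phoneme labels and acoustic features) and the MSE loss come from the abstract.

```python
import torch
import torch.nn as nn

class MultimodalLipModel(nn.Module):
    """Hypothetical sketch: fuse frame-level phoneme labels with acoustic
    features to regress per-frame facial shape parameters."""
    def __init__(self, num_phonemes=40, phoneme_dim=32,
                 acoustic_dim=26, hidden_dim=128, shape_dim=20):
        super().__init__()
        # Phoneme branch: learned embedding of discrete phoneme labels
        self.phoneme_embed = nn.Embedding(num_phonemes, phoneme_dim)
        # Shared recurrent trunk over the concatenated modalities
        self.rnn = nn.LSTM(phoneme_dim + acoustic_dim, hidden_dim,
                           batch_first=True, bidirectional=True)
        # Linear head regresses facial shape parameters per frame
        self.head = nn.Linear(2 * hidden_dim, shape_dim)

    def forward(self, phoneme_ids, acoustic_feats):
        # phoneme_ids: (batch, frames); acoustic_feats: (batch, frames, acoustic_dim)
        x = torch.cat([self.phoneme_embed(phoneme_ids), acoustic_feats], dim=-1)
        out, _ = self.rnn(x)
        return self.head(out)

model = MultimodalLipModel()
phonemes = torch.randint(0, 40, (8, 100))   # dummy phoneme labels
acoustics = torch.randn(8, 100, 26)         # dummy acoustic features (e.g. MFCCs)
target = torch.randn(8, 100, 20)            # dummy facial shape parameters
pred = model(phonemes, acoustics)
loss = nn.MSELoss()(pred, target)           # MSE, the metric reported in the paper
loss.backward()
```

Per the abstract, the phoneme branch is what supports speaker independence while the acoustic branch captures affective variation; a training run on GRID and SAVEE would hold out speakers to test that property.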
