ACM Transactions on Graphics

Audio-Driven Facial Animation by Joint End-to-End Learning of Pose and Emotion

Abstract

We present a machine learning technique for driving 3D facial animation by audio input in real time and with low latency. Our deep neural network learns a mapping from input waveforms to the 3D vertex coordinates of a face model, and simultaneously discovers a compact, latent code that disambiguates the variations in facial expression that cannot be explained by the audio alone. During inference, the latent code can be used as an intuitive control for the emotional state of the face puppet. We train our network with 3–5 minutes of high-quality animation data obtained using traditional, vision-based performance capture methods. Even though our primary goal is to model the speaking style of a single actor, our model yields reasonable results even when driven with audio from other speakers with different gender, accent, or language, as we demonstrate with a user study. The results are applicable to in-game dialogue, low-cost localization, virtual reality avatars, and telepresence.
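The abstract sketches the core idea: a single network maps a window of audio to the 3D vertex coordinates of a face mesh, while a small latent "emotion" vector, optimized jointly with the network weights, absorbs the expression variation the audio alone cannot explain. Below is a minimal JAX sketch of that joint formulation. It is not the authors' architecture; the dense layers, the sizes, and names such as `predict`, `EMOTION_DIM`, and `frame_ids` are illustrative assumptions.

```python
import jax
import jax.numpy as jnp

AUDIO_DIM = 64 * 32    # assumed: flattened window of audio features per output frame
EMOTION_DIM = 16       # assumed size of the latent emotion code
HIDDEN = 256           # assumed hidden width
N_VERTS = 5000         # assumed vertex count of the face mesh

def init_params(key):
    k1, k2, k3 = jax.random.split(key, 3)
    s = 0.01
    return {
        "w1": s * jax.random.normal(k1, (AUDIO_DIM + EMOTION_DIM, HIDDEN)),
        "b1": jnp.zeros(HIDDEN),
        "w2": s * jax.random.normal(k2, (HIDDEN, HIDDEN)),
        "b2": jnp.zeros(HIDDEN),
        "w3": s * jax.random.normal(k3, (HIDDEN, N_VERTS * 3)),
        "b3": jnp.zeros(N_VERTS * 3),
    }

def predict(params, audio_window, emotion):
    # One audio window plus one emotion code in, 3D vertex positions out.
    x = jnp.concatenate([audio_window, emotion])
    h = jax.nn.relu(x @ params["w1"] + params["b1"])
    h = jax.nn.relu(h @ params["w2"] + params["b2"])
    return (h @ params["w3"] + params["b3"]).reshape(N_VERTS, 3)

def loss(params, emotions, audio_batch, frame_ids, target_verts):
    # 'emotions' holds one free latent vector per training frame. It is a
    # trainable input, optimized end to end together with the network weights,
    # so it soaks up whatever the audio cannot disambiguate.
    preds = jax.vmap(lambda a, i: predict(params, a, emotions[i]))(
        audio_batch, frame_ids)
    return jnp.mean((preds - target_verts) ** 2)

# Gradients flow into both the network weights and the per-frame emotion codes:
grad_fn = jax.grad(loss, argnums=(0, 1))
```

At inference time the weights stay fixed and a chosen emotion code is fed in as a control knob, which is the "intuitive control for the emotional state of the face puppet" that the abstract describes.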
