ACM Transactions on Graphics

VisemeNet: Audio-Driven Animator-Centric Speech Animation



Abstract

We present a novel deep-learning-based approach to producing animator-centric speech motion curves that drive a JALI or standard FACS-based production face-rig, directly from input audio. Our three-stage Long Short-Term Memory (LSTM) network architecture is motivated by psycho-linguistic insights: segmenting speech audio into a stream of phonetic groups is sufficient for viseme construction; speech styles like mumbling or shouting are strongly correlated with the motion of facial landmarks; and animator style is encoded in viseme motion curve profiles. Our contribution is an automatic, real-time audio-driven lip-synchronization solution that integrates seamlessly into existing animation pipelines. We evaluate our results by: cross-validation against ground-truth data; animator critique and edits; visual comparison to recent deep-learning lip-synchronization solutions; and showing our approach to be resilient to diversity in speaker and language.
