Published in: International Journal of Computer Vision

Realistic Speech-Driven Facial Animation with GANs


Abstract

Speech-driven facial animation is the process of automatically synthesizing talking characters from speech signals. The majority of work in this domain creates a mapping from audio features to visual features. This approach often requires post-processing with computer graphics techniques to produce realistic, albeit subject-dependent, results. We present an end-to-end system that generates videos of a talking head using only a still image of a person and an audio clip containing speech, without relying on handcrafted intermediate features. Our method generates videos which have (a) lip movements that are in sync with the audio and (b) natural facial expressions such as blinks and eyebrow movements. Our temporal GAN uses three discriminators focused on achieving detailed frames, audio-visual synchronization, and realistic expressions. We quantify the contribution of each component in our model using an ablation study, and we provide insights into the latent representation of the model. The generated videos are evaluated on sharpness, reconstruction quality, lip-reading accuracy, and synchronization, as well as on the naturalness of the generated blinks.
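As a rough illustration of how a generator might be trained against several discriminators at once, the sketch below sums one adversarial term per discriminator. It is not the authors' implementation: the names `frame`, `sync`, and `sequence`, the equal weights, and the non-saturating binary cross-entropy form are all assumptions made for this example, and the abstract does not specify any of them.

```python
import math


def bce(prob, target):
    """Binary cross-entropy for one probability, clamped away from 0 and 1."""
    eps = 1e-12
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))


def generator_adversarial_loss(scores, weights):
    """Weighted sum of non-saturating generator losses, one per discriminator.

    scores:  name -> probability that the discriminator assigns "real"
             to a generated sample (hypothetical values here)
    weights: name -> relative weight of that discriminator (assumed)
    """
    # The generator is rewarded when each discriminator outputs "real" (1.0).
    return sum(weights[name] * bce(scores[name], 1.0) for name in scores)


# Hypothetical discriminator outputs for one generated clip.
weights = {"frame": 1.0, "sync": 1.0, "sequence": 1.0}
undecided = {"frame": 0.5, "sync": 0.5, "sequence": 0.5}
fooled = {"frame": 0.9, "sync": 0.9, "sequence": 0.9}

loss_undecided = generator_adversarial_loss(undecided, weights)
loss_fooled = generator_adversarial_loss(fooled, weights)
```

In this sketch the loss falls as the discriminators are fooled more often, so `loss_fooled` is smaller than `loss_undecided`. Changing the assumed weights would trade off frame detail, audio-visual synchronization, and expression realism against one another.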
