Annual Conference of the International Speech Communication Association (INTERSPEECH 2010)

Synthesizing Photo-Real Talking Head via Trajectory-Guided Sample Selection


Abstract

In this paper, we propose an HMM trajectory-guided, real image sample concatenation approach to photo-real talking head synthesis. It renders a smooth and natural video of articulators in sync with given speech signals. An audio-visual database is first used to train a statistical Hidden Markov Model (HMM) of lip movement; the trained model is then used to generate a visual parameter trajectory of lip movement for given speech signals, all in the maximum likelihood sense. The HMM-generated trajectory is then used as a guide to select, from the original training database, an optimal sequence of mouth images, which are then stitched back onto a background head video. The whole procedure is fully automatic and data driven. With audio/video footage as short as 20 minutes of a speaker, the proposed system can synthesize a highly photo-real video in sync with the given speech signals. The system won first place in the audio-visual match contest of the LIPS2009 Challenge, which was perceptually evaluated by recruited human subjects.
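The selection step described above lends itself to a Viterbi-style dynamic program: each synthesized frame should stay close to the HMM-generated guide trajectory, while consecutive chosen samples should remain visually continuous. The Python sketch below illustrates this kind of trajectory-guided sample selection under stated assumptions; the names (select_samples, trajectory, library, w_concat) and the plain Euclidean target and concatenation costs are illustrative choices, not the paper's exact cost functions.

import numpy as np

def select_samples(trajectory, library, w_concat=1.0):
    """Trajectory-guided sample selection (minimal sketch, hypothetical costs).

    trajectory : (T, D) HMM-generated visual parameters, one row per output frame.
    library    : (N, D) visual parameters of the N mouth images in the database.
    Returns a list of T library indices, one selected image per frame.
    """
    T, N = len(trajectory), len(library)

    # Target cost: distance of each library sample to each guide frame.
    target = np.linalg.norm(trajectory[:, None, :] - library[None, :, :], axis=-1)  # (T, N)

    # Concatenation cost: discontinuity between any two library samples.
    concat = np.linalg.norm(library[:, None, :] - library[None, :, :], axis=-1)     # (N, N)

    # Viterbi-style dynamic programming over sample choices.
    cost = target[0].copy()
    back = np.zeros((T, N), dtype=int)
    for t in range(1, T):
        total = cost[:, None] + w_concat * concat        # rows: previous sample, cols: current
        back[t] = np.argmin(total, axis=0)
        cost = total[back[t], np.arange(N)] + target[t]

    # Trace back the minimum-cost sample sequence.
    path = [int(np.argmin(cost))]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]

In the actual system the selected mouth images would then be stitched back onto the background head video; the sketch only covers the sample-selection stage, and w_concat trades smoothness of the concatenated sequence against fidelity to the guide trajectory.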
