Synthesizing Photo-Real Talking Head via Trajectory-Guided Sample Selection

Abstract

In this paper, we propose an HMM trajectory-guided, real-image sample concatenation approach to photo-real talking head synthesis. It renders a smooth and natural video of the articulators in sync with given speech signals. An audio-visual database is first used to train a statistical hidden Markov model (HMM) of lip movement; the trained model is then used to generate a visual parameter trajectory of lip movement for given speech signals, all in the maximum likelihood sense. The HMM-generated trajectory is then used as a guide to select, from the original training database, an optimal sequence of mouth images, which are then stitched back into a background head video. The whole procedure is fully automatic and data-driven. With as little as 20 minutes of audio/video footage from a speaker, the proposed system can synthesize a highly photo-real video in sync with the given speech signals. This system won first place in the Audio-Visual match contest of the LIPS2009 Challenge, which was perceptually evaluated by recruited human subjects.
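
The selection step described above (using a generated trajectory as a guide to pick an optimal sequence of stored samples) can be illustrated with a small dynamic-programming sketch. This is not the authors' implementation: the abstract does not state the cost functions, so the code assumes a Euclidean target cost between each guide frame and a candidate sample, plus a weighted Euclidean concatenation cost between consecutive picks. The names `select_samples`, `trajectory`, `library`, and `concat_weight` are hypothetical.

```python
from math import dist


def select_samples(trajectory, library, concat_weight=1.0):
    """Pick one library sample per frame with a Viterbi-style search.

    trajectory: guide feature vectors, one per output frame.
    library: candidate feature vectors, one per stored mouth image.
    Returns a list of library indices, one per frame.
    Both cost terms are illustrative assumptions, not the paper's.
    """
    if not trajectory or not library:
        return []
    n = len(library)
    # Assumed concatenation cost: distance between consecutive picks.
    concat = [[concat_weight * dist(a, b) for b in library] for a in library]
    # cost[j]: cheapest path cost ending at sample j for the current frame.
    cost = [dist(trajectory[0], sample) for sample in library]
    back = []  # back[t][j]: best predecessor of sample j at frame t + 1
    for target in trajectory[1:]:
        prev, new_cost = [], []
        for j in range(n):
            i_best = min(range(n), key=lambda i: cost[i] + concat[i][j])
            prev.append(i_best)
            new_cost.append(
                cost[i_best] + concat[i_best][j] + dist(target, library[j])
            )
        back.append(prev)
        cost = new_cost
    # Trace the cheapest path back from the last frame.
    j = min(range(n), key=lambda k: cost[k])
    path = [j]
    for prev in reversed(back):
        j = prev[j]
        path.append(j)
    path.reverse()
    return path


library = [(0.0,), (1.0,), (2.0,)]
guide = [(0.1,), (0.9,), (2.2,)]
print(select_samples(guide, library, concat_weight=0.1))
print(select_samples(guide, library, concat_weight=100.0))
```

With a small `concat_weight` the picks follow the guide closely; with a large one the search prefers staying on a single sample, which shows the trade-off between matching the trajectory and keeping consecutive frames smooth.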

Bibliographic record

Source: Annual Conference of the International Speech Communication Association (INTERSPEECH 2010), 2010, 4 pages
Authors: Lijuan Wang; Xiaojun Qian; Wei Han; Frank K. Soong
Original format: PDF
Chinese Library Classification: Communications
Keywords: visual speech synthesis; photo-real; talking head; trajectory-guided

