Annual Conference of the International Speech Communication Association (INTERSPEECH 2010)

Synthesizing Photo-Real Talking Head via Trajectory-Guided Sample Selection


Abstract

In this paper, we propose an HMM trajectory-guided, real image sample concatenation approach to photo-real talking head synthesis. It renders a smooth and natural video of articulators in sync with given speech signals. An audio-visual database is first used to train a statistical Hidden Markov Model (HMM) of lip movement; the trained model is then used to generate a visual parameter trajectory of lip movement for given speech signals, all in the maximum likelihood sense. The HMM-generated trajectory is then used as a guide to select, from the original training database, an optimal sequence of mouth images, which are then stitched back onto a background head video. The whole procedure is fully automatic and data driven. With audio/video footage as short as 20 minutes of a speaker, the proposed system can synthesize a highly photo-real video in sync with the given speech signals. The system won first place in the audio-visual match contest of the LIPS2009 Challenge, which was perceptually evaluated by recruited human subjects.
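The selection step described above lends itself to a Viterbi-style dynamic program: each synthesized frame should stay close to the HMM-generated guide trajectory, while consecutive chosen samples should remain visually continuous. The Python sketch below illustrates this kind of trajectory-guided sample selection under stated assumptions; the names (select_samples, trajectory, library, w_concat) and the plain Euclidean target and concatenation costs are illustrative choices, not the paper's exact cost functions.

import numpy as np

def select_samples(trajectory, library, w_concat=1.0):
    """Trajectory-guided sample selection (minimal sketch, hypothetical costs).

    trajectory : (T, D) HMM-generated visual parameters, one row per output frame.
    library    : (N, D) visual parameters of the N mouth images in the database.
    Returns a list of T library indices, one selected image per frame.
    """
    T, N = len(trajectory), len(library)

    # Target cost: distance of each library sample to each guide frame.
    target = np.linalg.norm(trajectory[:, None, :] - library[None, :, :], axis=-1)  # (T, N)

    # Concatenation cost: discontinuity between any two library samples.
    concat = np.linalg.norm(library[:, None, :] - library[None, :, :], axis=-1)     # (N, N)

    # Viterbi-style dynamic programming over sample choices.
    cost = target[0].copy()
    back = np.zeros((T, N), dtype=int)
    for t in range(1, T):
        total = cost[:, None] + w_concat * concat        # rows: previous sample, cols: current
        back[t] = np.argmin(total, axis=0)
        cost = total[back[t], np.arange(N)] + target[t]

    # Trace back the minimum-cost sample sequence.
    path = [int(np.argmin(cost))]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]

In the actual system the selected mouth images would then be stitched back onto the background head video; the sketch only covers the sample-selection stage, and w_concat trades smoothness of the concatenated sequence against fidelity to the guide trajectory.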
