
Data-driven facial animation synthesis by learning from facial motion capture data.


Abstract

Synthesizing realistic facial animation remains one of the most challenging topics in the graphics community because of the complexity of a moving face's deformation and our inherent sensitivity to the subtleties of human facial motion. The central goal of this dissertation is data-driven facial animation synthesis that captures the dynamics, naturalness, and personality of facial motion while human subjects speak with emotion. The solution is to synthesize realistic 3D talking faces by learning from facial motion capture data. This dissertation addresses three critical parts of realistic talking-face synthesis: realistic eye motion synthesis, natural head motion synthesis, and expressive speech animation synthesis.

A texture-synthesis-based approach is presented to simultaneously synthesize realistic eye gaze and blink motion, accounting for possible correlations between the two. The quality of the statistical modeling and the introduction of gaze-eyelid coupling are improvements over previous work, and the synthesized eye motion is hard to distinguish from actual captured eye motion.

Two different approaches (sample-based and model-based) are presented to synthesize appropriate head motion. Given aligned training pairs of audio features and head motion, the sample-based approach uses a K-nearest-neighbors-based dynamic programming algorithm to search for the optimal head motion samples for novel speech input. The model-based approach uses Hidden Markov Models (HMMs) to synthesize natural head motion; the HMMs are trained to capture the temporal relation between acoustic prosodic features and head motion.

This dissertation also presents two different approaches (model-based and sample-based) to generate novel expressive speech animation given new speech input. The model-based approach learns speech co-articulation models and expression eigen spaces from facial motion data, then synthesizes novel expressive speech animation by applying the generative co-articulation models and sampling from the constructed expression eigen spaces. The sample-based system (eFASE) automatically generates expressive speech animation by concatenating captured facial motion frames while animators establish constraints and goals (novel phoneme-aligned speech input and its emotion modifiers). Users can also edit the processed facial motion database via a novel phoneme-Isomap interface.
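The sample-based head-motion approach combines K-nearest-neighbor candidate selection with dynamic programming over aligned audio-feature/head-motion training pairs. The sketch below is only an illustration of that general idea, not the dissertation's actual algorithm or code; the function name `knn_dp_head_motion`, the Euclidean feature distance, and the simple smoothness cost are all assumptions made for the example.

```python
# Illustrative sketch only: simplified KNN + dynamic-programming selection of
# head-motion samples from aligned (audio feature, head pose) training pairs.
import numpy as np

def knn_dp_head_motion(audio_feats, train_audio, train_motion, k=5, w_smooth=1.0):
    """For each input audio frame, restrict candidates to its k nearest training
    frames (by audio-feature distance), then pick one candidate per frame with
    dynamic programming that trades off audio match against motion smoothness."""
    T = len(audio_feats)
    # pairwise audio-feature distances between input frames and training frames
    dists = np.linalg.norm(train_audio[None, :, :] - audio_feats[:, None, :], axis=2)
    cand = np.argsort(dists, axis=1)[:, :k]             # (T, k) candidate indices
    obs_cost = np.take_along_axis(dists, cand, axis=1)  # audio-matching cost

    dp = np.full((T, k), np.inf)
    back = np.zeros((T, k), dtype=int)
    dp[0] = obs_cost[0]
    for t in range(1, T):
        for j in range(k):
            # transition cost: discontinuity between consecutive head-motion samples
            trans = w_smooth * np.linalg.norm(
                train_motion[cand[t, j]] - train_motion[cand[t - 1]], axis=1)
            total = dp[t - 1] + trans
            back[t, j] = int(np.argmin(total))
            dp[t, j] = total[back[t, j]] + obs_cost[t, j]

    # backtrack the optimal candidate path and return the selected motion frames
    path = [int(np.argmin(dp[-1]))]
    for t in range(T - 1, 0, -1):
        path.append(back[t, path[-1]])
    path.reverse()
    return np.array([train_motion[cand[t, j]] for t, j in enumerate(path)])
```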
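The model-based speech-animation approach samples from constructed expression eigen spaces. A minimal sketch of how such an eigen space could be built with PCA and then sampled follows; the helper names (`build_expression_eigenspace`, `sample_expression`) and the Gaussian coefficient model are assumptions for illustration, not the dissertation's implementation.

```python
# Illustrative sketch only: constructing an "expression eigen-space" via PCA over
# facial motion frames of one emotion category, then sampling expression offsets.
import numpy as np

def build_expression_eigenspace(frames, n_components=10):
    """frames: (N, D) stacked facial motion vectors for one emotion category."""
    mean = frames.mean(axis=0)
    centered = frames - mean
    # principal directions via SVD; rows of vt are eigenvectors of the covariance
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]                         # (n_components, D)
    stddev = s[:n_components] / np.sqrt(len(frames) - 1)
    return mean, basis, stddev

def sample_expression(mean, basis, stddev, rng=np.random.default_rng()):
    """Draw one plausible expression configuration by sampling eigen-coefficients."""
    coeffs = rng.normal(0.0, stddev)                  # one coefficient per eigenvector
    return mean + coeffs @ basis
```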

Record details

  • Author: Deng, Zhigang
  • Author affiliation: University of Southern California
  • Degree-granting institution: University of Southern California
  • Subject: Computer Science
  • Degree: Ph.D.
  • Year: 2006
  • Pages: 146 p.
  • Total pages: 146
  • Original format: PDF
  • Language: English (eng)
  • CLC classification: Automation and computer technology
  • Keywords:
  • Date added: 2022-08-17 11:40:58

