
A framework for automatic creation of talking heads for multimedia applications.



Abstract

In this dissertation, a framework for the automatic creation of talking heads for various multimedia applications is presented. Within this framework, we present a new audio-to-visual conversion algorithm that uses a constrained optimization approach to take advantage of the dynamics of mouth movements. Based on facial muscle analysis, the dynamics of mouth movements are modeled and constraints are derived from that model. The obtained constraints are used to estimate visual parameters from speech within an HMM-based visual parameter estimation framework. The proposed constrained optimization approach finds visual parameters that satisfy the given constraints and maximize the auxiliary function used to train the audio-visual HMMs. This approach enables the algorithm to produce reliable visual parameters even in noisy environments. Experimental results demonstrate that the proposed audio-to-visual conversion method can track the true visual parameters robustly in various noisy environments.

In addition to the constrained optimization approach for robust audio-to-visual conversion, an automatic scheme for creating a 3D head model is presented. In this scheme, a probabilistic approach is presented to decide whether extracted facial features are suitable for creating a 3D face model. 2D facial features automatically extracted from a video sequence are fed into the proposed probabilistic framework before the corresponding 3D face model is built, to avoid generating an unnatural or unrealistic model. We also present a face shape extractor, based on an ellipse model controlled by three anchor points, that is accurate and computationally cheap. To create a 3D face model, a least-squares approach is used to find the coefficient vector needed to adapt a generic 3D model to the extracted facial features. Experimental results show that the proposed scheme can efficiently build a 3D face model from a video sequence without any user intervention for various Internet applications, such as virtual conferencing and virtual storytelling, that do not require much head movement or high-quality facial animation.
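To illustrate the constrained estimation idea, the following is a minimal Python sketch, not the dissertation's actual formulation: it assumes the HMM auxiliary function reduces to per-frame Gaussian terms for a single visual parameter (e.g., mouth opening) under a fixed state sequence, and it stands in for the muscle-based dynamics model with a hypothetical bound on frame-to-frame change.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical per-frame Gaussian statistics (means, variances) of one visual
# parameter, taken from the most likely audio-visual HMM state sequence.
means = np.array([0.10, 0.45, 0.85, 0.60, 0.20])
variances = np.full_like(means, 0.05)

def neg_auxiliary(v):
    # Negative of a simplified auxiliary function: sum of per-frame Gaussian
    # log-likelihood terms (constants dropped) for the visual parameters v.
    return np.sum((v - means) ** 2 / (2.0 * variances))

# Assumed dynamics constraint: the parameter may not change by more than
# MAX_STEP between consecutive frames (a stand-in for the mouth dynamics).
MAX_STEP = 0.3
constraints = []
for t in range(len(means) - 1):
    constraints.append({"type": "ineq",
                        "fun": lambda v, t=t: MAX_STEP - (v[t + 1] - v[t])})
    constraints.append({"type": "ineq",
                        "fun": lambda v, t=t: MAX_STEP + (v[t + 1] - v[t])})

result = minimize(neg_auxiliary, x0=means.copy(),
                  method="SLSQP", constraints=constraints)
print("constrained visual trajectory:", np.round(result.x, 3))
```

With noisy audio, the unconstrained maximizer would simply reproduce the noisy per-frame means; the dynamics constraints pull the trajectory toward physically plausible mouth movements, which is the intuition behind the robustness claim above.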
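The 3D model adaptation step can be sketched under similarly simple assumptions: a generic face model represented as a mean landmark shape plus a linear deformation basis, and an orthographic projection of the 3D landmarks onto the image plane. Neither the basis nor the projection here comes from the dissertation; they only illustrate the least-squares fit of the coefficient vector.

```python
import numpy as np

rng = np.random.default_rng(0)
n_landmarks, n_modes = 6, 3

# Hypothetical generic 3D model: mean landmark shape plus a linear deformation
# basis, both flattened to length 3 * n_landmarks.
mean_shape = rng.normal(size=3 * n_landmarks)
basis = rng.normal(size=(3 * n_landmarks, n_modes))

# Assumed orthographic projection keeping x and y of each landmark: (2n, 3n).
P = np.kron(np.eye(n_landmarks), np.array([[1.0, 0.0, 0.0],
                                           [0.0, 1.0, 0.0]]))

# Synthetic "extracted" 2D facial features generated from a known coefficient
# vector, standing in for features tracked in a video sequence.
true_c = np.array([0.5, -0.2, 0.1])
features_2d = P @ (mean_shape + basis @ true_c)

# Least-squares fit: choose c minimizing ||P (mean_shape + basis c) - f||^2.
A = P @ basis
b = features_2d - P @ mean_shape
c_hat, *_ = np.linalg.lstsq(A, b, rcond=None)
print("recovered coefficient vector:", np.round(c_hat, 3))
```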

Bibliographic Record

  • Author

    Choi, KyoungHo.

  • Author Affiliation

    University of Washington.

  • Degree-Granting Institution: University of Washington.
  • Subject: Engineering, Electronics and Electrical.
  • Degree: Ph.D.
  • Year: 2002
  • Pagination: 95 p.
  • Total Pages: 95
  • Original Format: PDF
  • Language: eng
  • Chinese Library Classification: Radio electronics, telecommunication technology
  • Keywords:
  • Date Added: 2022-08-17 11:46:39

