High-Fidelity Facial and Speech Animation for VR HMDs

Abstract

Significant challenges currently prohibit expressive interaction in virtual reality (VR). Occlusions introduced by head-mounted displays (HMDs) make existing facial tracking techniques intractable, and even state-of-the-art techniques used for real-time facial tracking in unconstrained environments fail to capture subtle details of the user's facial expressions that are essential for compelling speech animation. We introduce a novel system for HMD users to control a digital avatar in real-time while producing plausible speech animation and emotional expressions. Using a monocular camera attached to an HMD, we record multiple subjects performing various facial expressions and speaking several phonetically-balanced sentences. These images are used with artist-generated animation data corresponding to these sequences to train a convolutional neural network (CNN) to regress images of a user's mouth region to the parameters that control a digital avatar. To make training this system more tractable, we use audio-based alignment techniques to map images of multiple users making the same utterance to the corresponding animation parameters. We demonstrate that this approach is also feasible for tracking the expressions around the user's eye region with an internal infrared (IR) camera, thereby enabling full facial tracking. This system requires no user-specific calibration, uses easily obtainable consumer hardware, and produces high-quality animations of speech and emotional expressions. Finally, we demonstrate the quality of our system on a variety of subjects and evaluate its performance against state-of-the-art real-time facial tracking techniques.
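The central regression step described above can be pictured with a short sketch. The following is a minimal, hypothetical PyTorch example, assuming grayscale mouth-region crops and a blendshape-style parameter vector as the regression target; the layer sizes, input resolution, and 40-dimensional output are illustrative assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class MouthRegressor(nn.Module):
    """Minimal CNN that regresses a grayscale mouth-region crop to a
    vector of avatar animation parameters (e.g. blendshape weights).
    Layer sizes and the output dimension are illustrative assumptions."""
    def __init__(self, num_params: int = 40):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(128 * 4 * 4, 256), nn.ReLU(),
            nn.Linear(256, num_params),  # animation parameters driving the avatar
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(x))

# Training sketch: L2 loss against artist-generated parameter targets.
model = MouthRegressor()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
images = torch.rand(8, 1, 96, 96)   # dummy batch of mouth-region crops
targets = torch.rand(8, 40)         # dummy per-frame animation parameters
loss = nn.functional.mse_loss(model(images), targets)
opt.zero_grad()
loss.backward()
opt.step()
```

At inference time the same network maps each incoming camera frame to a parameter vector, which is then applied directly to the digital avatar's rig.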
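The audio-based alignment step amounts to warping two recordings of the same utterance onto a common timeline, so that artist-generated animation parameters from a reference take can label another subject's video frames. Below is a minimal sketch assuming plain dynamic time warping (DTW) over per-frame audio features; the MFCC-like features, sequence lengths, and 40-dimensional parameters are placeholder assumptions, and the paper does not necessarily use this exact procedure.

```python
import numpy as np

def dtw_path(a: np.ndarray, b: np.ndarray):
    """Dynamic time warping between feature sequences a (n, d) and b (m, d).
    Returns a monotonic list of (i, j) index pairs aligning the two takes."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1])
    # Backtrack from (n, m) to (0, 0) to recover the warping path.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]

# Example: label a new user's frames with the reference take's parameters
# by aligning the two utterances' audio features (random stand-ins here).
ref_audio = np.random.rand(50, 13)   # e.g. MFCCs of the reference take
usr_audio = np.random.rand(60, 13)   # MFCCs of the new user's take
ref_params = np.random.rand(50, 40)  # artist-generated parameters per frame
usr_params = np.zeros((60, 40))
for i, j in dtw_path(ref_audio, usr_audio):
    usr_params[j] = ref_params[i]    # user frame j inherits aligned parameters
```

This kind of alignment lets one set of artist-generated animation curves supervise training images from many subjects, which is what makes training the regressor tractable without per-user annotation.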
