Venue: European Conference on Computer Vision

Real-Time Facial Segmentation and Performance Capture from RGB Input


Abstract

We introduce the concept of unconstrained real-time 3D facial performance capture through explicit semantic segmentation in the RGB input. To ensure robustness, cutting edge supervised learning approaches rely on large training datasets of face images captured in the wild. While impressive tracking quality has been demonstrated for faces that are largely visible, any occlusion due to hair, accessories, or hand-to-face gestures would result in significant visual artifacts and loss of tracking accuracy. The modeling of occlusions has been mostly avoided due to its immense space of appearance variability. To address this curse of high dimensionality, we perform tracking in unconstrained images assuming non-face regions can be fully masked out. Along with recent breakthroughs in deep learning, we demonstrate that pixel-level facial segmentation is possible in real-time by repurposing convolutional neural networks designed originally for general semantic segmentation. We develop an efficient architecture based on a two-stream deconvolution network with complementary characteristics, and introduce carefully designed training samples and data augmentation strategies for improved segmentation accuracy and robustness. We adopt a state-of-the-art regression-based facial tracking framework with segmented face images as training, and demonstrate accurate and uninterrupted facial performance capture in the presence of extreme occlusion and even side views. Furthermore, the resulting segmentation can be directly used to composite partial 3D face models on the input images and enable seamless facial manipulation tasks, such as virtual make-up or face replacement.
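The core idea described above — using pixel-level segmentation to fully mask out non-face regions before regression-based tracking — can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: `mask_non_face` is a hypothetical helper, and the toy arrays stand in for a real RGB frame and the binary mask a segmentation CNN would produce.

```python
import numpy as np

def mask_non_face(frame, face_mask, fill=0):
    """Zero out (or fill) every pixel outside the face region.

    frame:     H x W x 3 uint8 RGB image
    face_mask: H x W binary mask (1 = face, 0 = occluder/background),
               e.g. the output of a segmentation network
    """
    assert frame.shape[:2] == face_mask.shape
    masked = frame.copy()
    # Boolean-index the non-face pixels and overwrite them,
    # so the downstream tracker only ever sees face pixels.
    masked[~face_mask.astype(bool)] = fill
    return masked

# Toy example: a uniform 4x4 "frame" with the face in the top-left 2x2.
frame = np.full((4, 4, 3), 200, dtype=np.uint8)
mask = np.zeros((4, 4), dtype=np.uint8)
mask[:2, :2] = 1
out = mask_non_face(frame, mask)
```

In the pipeline the abstract describes, the same masking would be applied to the training images of the regression-based tracker, so that occluded pixels never contribute to the learned features.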
