Venue: European Conference on Computer Vision

Real-Time Facial Segmentation and Performance Capture from RGB Input


Abstract

We introduce the concept of unconstrained real-time 3D facial performance capture through explicit semantic segmentation in the RGB input. To ensure robustness, cutting edge supervised learning approaches rely on large training datasets of face images captured in the wild. While impressive tracking quality has been demonstrated for faces that are largely visible, any occlusion due to hair, accessories, or hand-to-face gestures would result in significant visual artifacts and loss of tracking accuracy. The modeling of occlusions has been mostly avoided due to its immense space of appearance variability. To address this curse of high dimensionality, we perform tracking in unconstrained images assuming non-face regions can be fully masked out. Along with recent breakthroughs in deep learning, we demonstrate that pixel-level facial segmentation is possible in real-time by repurposing convolutional neural networks designed originally for general semantic segmentation. We develop an efficient architecture based on a two-stream deconvolution network with complementary characteristics, and introduce carefully designed training samples and data augmentation strategies for improved segmentation accuracy and robustness. We adopt a state-of-the-art regression-based facial tracking framework with segmented face images as training, and demonstrate accurate and uninterrupted facial performance capture in the presence of extreme occlusion and even side views. Furthermore, the resulting segmentation can be directly used to composite partial 3D face models on the input images and enable seamless facial manipulation tasks, such as virtual make-up or face replacement.
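The core idea described above — using pixel-level segmentation to fully mask out non-face regions before regression-based tracking — can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: `mask_non_face` is a hypothetical helper, and the toy arrays stand in for a real RGB frame and the binary mask a segmentation CNN would produce.

```python
import numpy as np

def mask_non_face(frame, face_mask, fill=0):
    """Zero out (or fill) every pixel outside the face region.

    frame:     H x W x 3 uint8 RGB image
    face_mask: H x W binary mask (1 = face, 0 = occluder/background),
               e.g. the output of a segmentation network
    """
    assert frame.shape[:2] == face_mask.shape
    masked = frame.copy()
    # Boolean-index the non-face pixels and overwrite them,
    # so the downstream tracker only ever sees face pixels.
    masked[~face_mask.astype(bool)] = fill
    return masked

# Toy example: a uniform 4x4 "frame" with the face in the top-left 2x2.
frame = np.full((4, 4, 3), 200, dtype=np.uint8)
mask = np.zeros((4, 4), dtype=np.uint8)
mask[:2, :2] = 1
out = mask_non_face(frame, mask)
```

In the pipeline the abstract describes, the same masking would be applied to the training images of the regression-based tracker, so that occluded pixels never contribute to the learned features.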
