International Conference on Pattern Recognition

Facial Attribute Editing by Latent Space Adversarial Variational Autoencoders


Abstract

This work addresses the problem of editing facial images by manipulating specified attributes of interest. To learn latent representations disentangled with respect to specified face attributes, a novel attribute-disentangled generative model is proposed that combines variational autoencoders (VAEs) and generative adversarial networks (GANs). The proposed model contains only two deep mappings, an encoder and a decoder, similar to their counterparts in VAEs. The latent space mapped by the encoder is split into two parts: a style space and an attribute space. The former represents attribute-irrelevant factors such as identity, position, illumination, and background; the latter represents attributes such as hair color, gender, and the presence of glasses, with each dimension encoding a single attribute. By treating constraints on the encoder's output as discriminative objectives, the encoder acts not only as a discriminator that distinguishes real samples from generated ones, but also as an attribute classifier that decides whether a sample has the specified attributes. In addition to the reconstruction and Kullback-Leibler (KL) divergence regularization losses used in VAEs, an adversarial training loss defined over the style and attribute parts of the latent space is introduced, driving the proposed model to generate images whose distribution is close to the real data distribution in the latent space. Finally, the model was evaluated on the CelebA dataset, and experimental results demonstrate its effectiveness in disentangling face attributes and generating high-quality face images.
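The latent-space split and the standard VAE regularizer mentioned in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the dimensions, function names, and attribute indices are illustrative assumptions.

```python
import numpy as np

def split_latent(z, n_attr):
    """Split an encoder output into a style part (attribute-irrelevant
    factors) and an attribute part, where each of the last n_attr
    dimensions encodes one attribute."""
    z_style, z_attr = z[..., :-n_attr], z[..., -n_attr:]
    return z_style, z_attr

def kl_divergence(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ): the closed-form KL
    regularization term used in VAEs, summed over latent dimensions."""
    return -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var), axis=-1)

def edit_attribute(z_attr, index, value):
    """Editing amounts to setting one attribute dimension (e.g. the one
    for 'glasses') to a target value before decoding."""
    edited = z_attr.copy()
    edited[..., index] = value
    return edited
```

For example, a 10-dimensional code with 3 attribute dimensions splits into a 7-dimensional style vector and a 3-dimensional attribute vector; flipping one attribute dimension and decoding would yield the edited image.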
