Journal: Image and Vision Computing

Transformer models for enhancing AttnGAN based text to image generation


Abstract

Deep neural networks are capable of producing photographic images that depict given natural language text descriptions. Such models have huge potential in applications such as interior design, video games, image editing, and facial sketching for digital forensics. However, only a limited number of methods in the literature have been developed for text-to-image (TTI) generation, and most of them are deep learning methods based on Generative Adversarial Networks (GANs). Attentional GAN (AttnGAN) is a popular GAN-based TTI method that extracts meaningful information from the given text descriptions using an attention mechanism. In this paper, we investigate the use of different Transformer models such as BERT, GPT2, and XLNet with AttnGAN to address the challenge of extracting semantic information from text descriptions. Accordingly, the proposed AttnGAN(TRANS) architecture has three variants: AttnGAN(BERT), AttnGAN(XL), and AttnGAN(GPT). The proposed method outperforms the conventional AttnGAN, improving the Inception Score by 27.23% and reducing the Frechet Inception Distance by 49.9%. The results of our experiments indicate that the proposed method has the potential to outperform contemporary state-of-the-art methods and validate the use of Transformer models in improving the performance of TTI generation. The code is made publicly available at https://github.com/sairamkiran9/AttnGAN-trans. (C) 2021 Elsevier B.V. All rights reserved.
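The core idea described above is to swap AttnGAN's recurrent text encoder for a Transformer that yields both word-level features (consumed by the attention module) and a sentence-level feature (conditioning the initial generator stage). The following is a minimal sketch of that interface using a toy PyTorch Transformer as a stand-in; the class name, dimensions, and pooling choice are illustrative assumptions, not the paper's actual implementation, which uses pretrained BERT/GPT2/XLNet encoders.

```python
import torch
import torch.nn as nn

class TransformerTextEncoder(nn.Module):
    """Toy stand-in for a pretrained Transformer (BERT/GPT2/XLNet)
    in an AttnGAN-style pipeline: maps token ids to word-level and
    sentence-level embeddings. Dimensions are illustrative."""

    def __init__(self, vocab_size=30522, d_model=256, nhead=4, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)

    def forward(self, token_ids):
        h = self.encoder(self.embed(token_ids))   # (B, T, d_model)
        word_features = h                          # fed to the attention module
        sentence_feature = h.mean(dim=1)           # (B, d_model), conditions the generator
        return word_features, sentence_feature

# One fake 12-token caption; in practice these ids come from the
# Transformer's own tokenizer applied to the text description.
tokens = torch.randint(0, 30522, (1, 12))
w, s = TransformerTextEncoder()(tokens)
```

In the real architecture, the pretrained Transformer's hidden states would replace the bidirectional LSTM outputs of the original AttnGAN, so the downstream attention and DAMSM components see the same two-tensor interface.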
