Bridge-GAN: Interpretable Representation Learning for Text-to-Image Synthesis

Mingkuan Yuan; Yuxin Peng


Abstract

Text-to-image synthesis aims to generate images whose content is consistent with a given text description. It is a highly challenging task with two main issues: visual reality and content consistency. Recent progress in generative adversarial networks has made it possible to generate images with high visual reality. However, translating a text description into an image with high content consistency remains difficult. To address these issues, it is reasonable to establish a transitional space with interpretable representation as a bridge that associates text and image. We therefore propose a text-to-image synthesis approach named Bridge-like Generative Adversarial Networks (Bridge-GAN). Its main contributions are: (1) A transitional space is established as a bridge for improving content consistency, in which the interpretable representation can be learned by ensuring that the key visual information from the given text descriptions is retained. (2) A ternary mutual information objective is designed to optimize the transitional space and to enhance both visual reality and content consistency. It is proposed with the goal of disentangling the latent factors conditioned on the text description, for further interpretable representation learning. Comprehensive experiments on two widely used datasets verify the effectiveness of Bridge-GAN, which achieves the best performance.


Bibliographic record

Source
Circuits and Systems for Video Technology, IEEE Transactions on | 2020, No. 11 | pp. 4258-4268 | 11 pages
Authors
Mingkuan Yuan; Yuxin Peng
Affiliations

Wangxuan Institute of Computer Technology, Peking University, Beijing, China (both authors)

Original format: PDF
Language: English
Keywords
Visualization; Mutual information; Image synthesis; Task analysis; Training; Bridge circuits; Semantics
Date added to database: 2022-08-18 20:58:21

