Conference: IEEE International Conference on Acoustics, Speech and Signal Processing

Towards Audio to Scene Image Synthesis Using Generative Adversarial Network



Abstract

Humans can imagine a scene from a sound. We want machines to do the same by using conditional generative adversarial networks (GANs). By applying techniques including spectral normalization, a projection discriminator, and an auxiliary classifier, the model generates images of better quality than a naive conditional GAN in both subjective and objective evaluations. Almost three-fourths of people agree that our model is able to generate images related to sounds. When the same sound is input at different volumes, our model outputs changes of different scales according to the volume, showing that the model has, to some extent, learned the relationship between sounds and images.
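The abstract names spectral normalization as one of the stabilizing techniques but gives no implementation details. As a rough illustration only, the sketch below shows the commonly used power-iteration estimate of a weight matrix's largest singular value, followed by division of the weights by that estimate. It is not the authors' code: the function names (`spectral_norm`, `spectrally_normalize`), the iteration count, and the plain-list matrix representation are all assumptions chosen to keep the example self-contained.

```python
import math
import random


def matvec(W, v):
    """Compute W @ v for a matrix stored as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def matvec_t(W, u):
    """Compute W.T @ u without building the transpose."""
    return [sum(W[i][j] * u[i] for i in range(len(W))) for j in range(len(W[0]))]


def l2_normalize(v, eps=1e-12):
    """Scale v to unit Euclidean length (eps guards against division by zero)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / (norm + eps) for x in v]


def spectral_norm(W, n_iters=50, seed=0):
    """Estimate the largest singular value of W by power iteration.

    n_iters and seed are illustrative defaults, not values from the paper.
    """
    rng = random.Random(seed)
    u = l2_normalize([rng.gauss(0.0, 1.0) for _ in W])
    for _ in range(n_iters):
        v = l2_normalize(matvec_t(W, u))  # right singular vector estimate
        u = l2_normalize(matvec(W, v))    # left singular vector estimate
    # sigma is approximated by u^T W v
    return sum(a * b for a, b in zip(u, matvec(W, v)))


def spectrally_normalize(W, n_iters=50):
    """Return W divided by its estimated spectral norm."""
    sigma = spectral_norm(W, n_iters)
    return [[w / sigma for w in row] for row in W]


if __name__ == "__main__":
    W = [[3.0, 0.0], [0.0, 1.0]]
    print(spectral_norm(W))
    print(spectrally_normalize(W))
```

In a GAN discriminator, this rescaling would typically be applied to each layer's weights so that every layer has a spectral norm of about 1, which bounds how sharply the discriminator's output can change with its input. Deep learning frameworks usually run a single power-iteration step per training update and reuse the vector `u` between updates, rather than iterating to convergence as done here.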
