Conference: IEEE International Conference on Acoustics, Speech and Signal Processing

Towards Audio to Scene Image Synthesis Using Generative Adversarial Network

Abstract

Humans can imagine a scene from a sound, and we want machines to do the same using conditional generative adversarial networks (GANs). By applying techniques including spectral normalization, a projection discriminator, and an auxiliary classifier, our model generates higher-quality images than a naive conditional GAN in both subjective and objective evaluations. Almost three-fourths of people agree that our model can generate images related to sounds. When given the same sound at different volumes, the model outputs changes of correspondingly different scales, showing that it has learned the relationship between sounds and images to some extent.
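
The abstract names spectral normalization as one of the stabilizing techniques but gives no implementation details. As a rough illustration only, the sketch below shows the standard power-iteration form of the idea: estimate the largest singular value of a weight matrix and divide the weights by it. The function names `spectral_norm` and `spectral_normalize`, the plain list-of-rows matrix representation, and the iteration count are assumptions made for this example and are not taken from the paper.

```python
import math
import random


def spectral_norm(weight, n_iters=50, seed=0):
    """Estimate the largest singular value of `weight` by power iteration.

    `weight` is a matrix given as a list of rows (hypothetical layout,
    chosen to keep the sketch dependency-free).
    """
    rng = random.Random(seed)
    rows, cols = len(weight), len(weight[0])
    # Random starting vector for the right singular direction.
    v = [rng.gauss(0.0, 1.0) for _ in range(cols)]
    sigma = 0.0
    for _ in range(n_iters):
        # u <- W v, then normalize.
        u = [sum(weight[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        u_len = math.sqrt(sum(x * x for x in u))
        if u_len == 0.0:
            return 0.0  # W v vanished (e.g. all-zero matrix).
        u = [x / u_len for x in u]
        # v <- W^T u; its length converges to the top singular value.
        v = [sum(weight[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        sigma = math.sqrt(sum(x * x for x in v))
        if sigma == 0.0:
            return 0.0
        v = [x / sigma for x in v]
    return sigma


def spectral_normalize(weight):
    """Return a copy of `weight` rescaled to have spectral norm about 1."""
    sigma = spectral_norm(weight)
    if sigma == 0.0:
        return [row[:] for row in weight]
    return [[w / sigma for w in row] for row in weight]
```

For a diagonal matrix such as `[[3.0, 0.0], [0.0, 1.0]]`, `spectral_norm` converges to roughly 3, and the normalized matrix has a spectral norm of roughly 1. Deep learning frameworks usually run only one power-iteration step per training update and reuse the vector between steps; the many iterations here are purely for clarity.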
