
Audio style transfer using shallow convolutional networks and random filters



Abstract

Recently, with the advent of the Convolutional Neural Network (CNN) era, neural style transfer on images has become a very active research topic: the style of one image can be transferred to another through a CNN, so that the resulting image retains its own content while taking on the style of the other image. In this work, we propose an algorithm for audio style transfer that uses the power of CNNs to generate new audio from a style audio. We use the Continuous Wavelet Transform (CWT) to convert the audio into a spectrogram, treat the spectrogram as an image representation of the audio, apply an image style transfer method to obtain a new image, and finally generate audio using iterative phase reconstruction with Griffin-Lim. We succeed in transferring audio such as light music but have difficulty transferring audio that has lyrics or high-level characteristics such as emotion or tone. We propose several measures to improve audio quality, and extensive experimental results show that our method is better than other methods in terms of sound quality.
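The abstract does not spell out the network itself, so the following is only a minimal sketch of the general idea suggested by the title: a single (shallow) convolutional layer with random, untrained filters is applied to a spectrogram, and "style" is summarized by a Gram matrix of the resulting feature maps, as in common image style transfer formulations. The layout (frequency bins as input channels, 1-D convolution along time), the function names `random_conv_features`, `gram_matrix`, and `style_loss`, and all parameter values are assumptions made for illustration, not the authors' implementation. The CWT front end and the Griffin-Lim reconstruction are not shown.

```python
import numpy as np


def random_conv_features(spec, n_filters=32, width=11, seed=0):
    """One layer of random 1-D filters along time, followed by ReLU.

    spec: magnitude spectrogram of shape (n_freq, n_frames); frequency
          bins are treated as input channels (an assumption of this sketch).
    Returns an array of shape (n_filters, n_frames - width + 1).
    """
    rng = np.random.default_rng(seed)
    n_freq, _ = spec.shape
    # Untrained weights, scaled so activations stay in a moderate range.
    weights = rng.standard_normal((n_filters, n_freq, width))
    weights /= np.sqrt(n_freq * width)
    # Sliding windows over the time axis: (n_freq, n_out, width).
    windows = np.lib.stride_tricks.sliding_window_view(spec, width, axis=1)
    feats = np.einsum("kfw,ftw->kt", weights, windows)
    return np.maximum(feats, 0.0)


def gram_matrix(feats):
    """Filter-by-filter correlations, averaged over time frames."""
    return feats @ feats.T / feats.shape[1]


def style_loss(spec_a, spec_b, **kwargs):
    """Squared difference between the Gram matrices of two spectrograms.

    Both inputs pass through the same random filters (same seed).
    """
    gram_a = gram_matrix(random_conv_features(spec_a, **kwargs))
    gram_b = gram_matrix(random_conv_features(spec_b, **kwargs))
    return float(np.sum((gram_a - gram_b) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Stand-ins for real spectrograms: 64 frequency bins, 200 frames.
    style_spec = np.abs(rng.standard_normal((64, 200)))
    other_spec = np.abs(rng.standard_normal((64, 200)))
    print("loss(style, style):", style_loss(style_spec, style_spec))
    print("loss(style, other):", style_loss(style_spec, other_spec))
```

In a full style transfer loop, a loss of this kind would be minimized with respect to the generated spectrogram, typically alongside a content term, before the result is converted back to a waveform. Because the Gram matrix averages over time, it captures which filter responses co-occur rather than when they occur, which is why it is commonly used as a style statistic.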

Bibliographic details

  • Source
    Multimedia Tools and Applications | 2020, Issue 22 | pp. 15043-15057 | 15 pages
  • Author affiliations

    College of Information Science and Engineering, Hunan University, Changsha 410082, China; Hunan Provincial Key Laboratory of Intelligent Information Processing and Application, Hengyang Normal University, Hengyang 421002, China;

    College of Information Science and Engineering, Hunan University, Changsha 410082, China;

    Hunan Provincial Key Laboratory of Intelligent Information Processing and Application, Hengyang Normal University, Hengyang 421002, China;

    Hunan Provincial Key Laboratory of Intelligent Information Processing and Application, Hengyang Normal University, Hengyang 421002, China;

  • Original format: PDF
  • Language: English
  • Keywords

    Audio style transfer; Continuous wavelet transform; Deep neural network; Spectrogram;

