Published in: JMLR: Workshop and Conference Proceedings

Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders


Abstract

Generative models in vision have seen rapid progress due to algorithmic improvements and the availability of high-quality image datasets. In this paper, we offer contributions in both these areas to enable similar progress in audio modeling. First, we detail a powerful new WaveNet-style autoencoder model that conditions an autoregressive decoder on temporal codes learned from the raw audio waveform. Second, we introduce NSynth, a large-scale and high-quality dataset of musical notes that is an order of magnitude larger than comparable public datasets. Using NSynth, we demonstrate improved qualitative and quantitative performance of the WaveNet autoencoder over a well-tuned spectral autoencoder baseline. Finally, we show that the model learns a manifold of embeddings that allows for morphing between instruments, meaningfully interpolating in timbre to create new types of sounds that are realistic and expressive.
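The morphing between instruments described in the abstract can be illustrated with a minimal sketch, assuming that morphing is done by linearly blending two learned temporal codes before decoding. The abstract does not specify the interpolation rule, so linear blending is an assumption here, and `interpolate_codes`, `flute_code`, and `organ_code` are hypothetical names with toy values standing in for real encoder outputs.

```python
def interpolate_codes(code_a, code_b, alpha):
    """Linearly blend two temporal codes, frame by frame.

    code_a, code_b: lists of frames, each frame a list of floats
                    (hypothetical stand-ins for encoder outputs).
    alpha: blend weight in [0, 1]; 0 returns code_a, 1 returns code_b.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(code_a) != len(code_b):
        raise ValueError("codes must have the same number of frames")

    blended = []
    for frame_a, frame_b in zip(code_a, code_b):
        if len(frame_a) != len(frame_b):
            raise ValueError("frames must have the same number of channels")
        blended.append(
            [(1.0 - alpha) * a + alpha * b for a, b in zip(frame_a, frame_b)]
        )
    return blended


# Toy codes: 2 time frames, 2 channels each (illustrative values only).
flute_code = [[0.0, 1.0], [2.0, 3.0]]
organ_code = [[4.0, 5.0], [6.0, 7.0]]

# An equal-weight blend; in a real system this would be passed to the decoder.
blend = interpolate_codes(flute_code, organ_code, 0.5)
```

In an actual model the blended code would condition the autoregressive decoder to synthesize audio; that step is omitted here because it depends on a trained network.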
