Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders

Jesse Engel; Cinjon Resnick; Adam Roberts; Sander Dieleman; Mohammad Norouzi; Douglas Eck; Karen Simonyan

Abstract

Generative models in vision have seen rapid progress due to algorithmic improvements and the availability of high-quality image datasets. In this paper, we offer contributions in both these areas to enable similar progress in audio modeling. First, we detail a powerful new WaveNet-style autoencoder model that conditions an autoregressive decoder on temporal codes learned from the raw audio waveform. Second, we introduce NSynth, a large-scale and high-quality dataset of musical notes that is an order of magnitude larger than comparable public datasets. Using NSynth, we demonstrate improved qualitative and quantitative performance of the WaveNet autoencoder over a well-tuned spectral autoencoder baseline. Finally, we show that the model learns a manifold of embeddings that allows for morphing between instruments, meaningfully interpolating in timbre to create new types of sounds that are realistic and expressive.
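The timbre interpolation mentioned above can be illustrated with a minimal sketch. The abstract does not say how embeddings are combined, so the linear blend below is an assumption, not the authors' confirmed method. The function name `interpolate_codes` and the toy values are hypothetical. The encoder that would produce the codes and the autoregressive decoder that would turn the blended code back into audio are not shown.

```python
def interpolate_codes(z_a, z_b, alpha):
    """Linearly blend two temporal embeddings.

    z_a, z_b: lists of rows (time steps), each row a list of floats (channels).
    alpha:    mixing weight in [0, 1]; 0 returns z_a, 1 returns z_b.
    Assumption: a plain linear blend stands in for whatever interpolation
    the paper actually uses.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(z_a) != len(z_b) or any(len(a) != len(b) for a, b in zip(z_a, z_b)):
        raise ValueError("embeddings must have identical shapes")
    return [
        [(1.0 - alpha) * a + alpha * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(z_a, z_b)
    ]


# Hypothetical stand-ins for the codes of two instrument notes
# (2 time steps x 2 channels; real codes would come from a trained encoder).
z_flute = [[0.0, 1.0], [2.0, 3.0]]
z_organ = [[4.0, 5.0], [6.0, 7.0]]

z_mix = interpolate_codes(z_flute, z_organ, 0.5)
```

Sweeping `alpha` from 0 to 1 would give a sequence of intermediate codes, each of which a conditioned decoder could render as a sound between the two instruments.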


Bibliographic record

Source: JMLR: Workshop and Conference Proceedings, 2017, Issue 3, 10 pages
Authors: Jesse Engel; Cinjon Resnick; Adam Roberts; Sander Dieleman; Mohammad Norouzi; Douglas Eck; Karen Simonyan
Original format: PDF
Chinese Library Classification: Artificial intelligence theory

