Examining the Mapping Functions of Denoising Autoencoders in Singing Voice Separation

Stylianos Ioannis Mimilakis; Konstantinos Drossos; Estefanía Cano; Gerald Schuller

Abstract

The goal of this article is to investigate what singing voice separation approaches based on neural networks learn from the data. We examine the mapping functions of neural networks that are based on the denoising autoencoder (DAE) model and conditioned on the mixture magnitude spectra. To approximate the mapping functions, we propose an algorithm inspired by knowledge distillation, denoted the neural couplings algorithm (NCA). The NCA yields a matrix that expresses the mapping of the mixture to the target source magnitude information. Using the NCA, we examine the mapping functions of three fundamental DAE-based models in music source separation: one with a single-layer encoder and decoder, one with a multi-layer encoder and a single-layer decoder, and one using skip-filtering connections (SF) with single-layer encoding and decoding. We first train these models with realistic data to estimate the singing voice magnitude spectra from the corresponding mixture. We then use the optimized models and test spectral data as input to the NCA. Our experimental findings show that approaches based on the DAE model learn scalar filtering operators, exhibiting a predominantly diagonal structure in their corresponding mapping functions, which limits the exploitation of the inter-frequency structure of music data. In contrast, skip-filtering connections are shown to assist the DAE model in learning filtering operators that exploit richer inter-frequency structures.
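
The contrast drawn in the abstract between scalar (diagonal) filtering and filtering that uses inter-frequency structure can be illustrated with a minimal NumPy sketch. It is not the authors' NCA or their trained models. The weights are random, the ReLU activations are an assumption, the skip-filtering variant is assumed to multiply the decoder output element-wise with the input mixture, and the helper `diagonal_energy_ratio` is a hypothetical measure introduced here only to quantify how diagonal a mapping matrix is.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def dae_forward(x, w_enc, w_dec):
    """Single-layer encoder and decoder (assumed ReLU activations)."""
    return relu(w_dec @ relu(w_enc @ x))

def sf_forward(x, w_enc, w_dec):
    """Assumed skip-filtering form: decoder output masks the input mixture."""
    return dae_forward(x, w_enc, w_dec) * x

def diagonal_energy_ratio(m):
    """Hypothetical measure: share of squared energy on the main diagonal."""
    total = np.sum(m ** 2)
    return float(np.sum(np.diag(m) ** 2) / total) if total > 0 else 0.0

rng = np.random.default_rng(0)
n_bins, n_hidden = 8, 4                      # toy sizes, not from the paper
x = np.abs(rng.standard_normal(n_bins))      # stand-in mixture magnitude frame
w_enc = rng.standard_normal((n_hidden, n_bins))
w_dec = rng.standard_normal((n_bins, n_hidden))

y_dae = dae_forward(x, w_enc, w_dec)
y_sf = sf_forward(x, w_enc, w_dec)

# A scalar filter scales each frequency bin independently (diagonal matrix);
# a full matrix also mixes information across frequency bins.
scalar_filter = np.diag(rng.uniform(0.1, 1.0, n_bins))
full_filter = scalar_filter + 0.5 * rng.standard_normal((n_bins, n_bins))

print(diagonal_energy_ratio(scalar_filter))
print(diagonal_energy_ratio(full_filter))
```

A ratio close to 1 corresponds to the predominantly diagonal structure the abstract reports for the plain DAE models, while a lower ratio indicates that off-diagonal (inter-frequency) terms carry a larger share of the mapping.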

Bibliographic details

Source
IEEE/ACM Transactions on Audio, Speech, and Language Processing | 2020, issue 2020 | pp. 266-278 | 13 pages
Authors
Stylianos Ioannis Mimilakis; Konstantinos Drossos; Estefanía Cano; Gerald Schuller
Author affiliations

Semantic Music Technologies Group, Fraunhofer IDMT, Ilmenau, Germany;

Audio Research Group, Tampere University, Tampere, Finland;

Semantic Music Technologies Group, Fraunhofer IDMT, Ilmenau, Germany;

Technical University of Ilmenau, Germany;

Indexing information
Original format: PDF
Text language: English
Chinese Library Classification
Keywords
Source separation; Computational modeling; Noise reduction; Neural networks; Approximation algorithms; Multiple signal classification; Decoding;

Similar literature

1. Singing voice separation with pre-learned dictionary and reconstructed voice spectrogram [J]. Neural Computing & Applications. 2020, no. 8
2. Contemporary Commercial Music Singing Students-Voice Quality and Vocal Function at the Beginning of Singing Training [J]. Sielska-Badurek Ewelina M., Sobol Maria, Olszowska Katarzyna. Journal of Voice: Official Journal of the Voice Foundation. 2018, no. 6
3. Singing voice outcomes following singing voice therapy [J]. Dastolfo-Hromack Christina, Thomas Tracey L., Rosen Clark A. The Laryngoscope: A Medical Journal for Clinical and Research Contributions in Otolaryngology, Head and Neck Medicine and Surgery, Facial Plastic and Reconstructive Surgery. 2016, no. 11
4. Data Augmentation for Monaural Singing Voice Separation Based on Variational Autoencoder-Generative Adversarial Network [C]. Boxin He, Shengbei Wang, Weitao Yuan. IEEE International Conference on Multimedia and Expo. 2019
5. 雑音除去自己符号化器を用いた多様なテキストからの敵対的サンプル生成手法: Generating Adversarial Examples from Diverse Text using Denoising Autoencoder [D]. 保田和彦. 2019
6. Finding your voice: A singing lesson from functional imaging [O]. Sarah J. Wilson, David F. Abbott, Dean Lusher. 2011
7. Examining the Mapping Functions of Denoising Autoencoders in Singing Voice Separation [O]. Stylianos Ioannis Mimilakis, Konstantinos Drossos, Estefania Cano. 2020
