Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing

Examining the Mapping Functions of Denoising Autoencoders in Singing Voice Separation

Abstract

The goal of this article is to investigate what singing voice separation approaches based on neural networks learn from data. We examine the mapping functions of neural networks that are based on the denoising autoencoder (DAE) model and conditioned on the mixture magnitude spectra. To approximate the mapping functions, we propose an algorithm inspired by knowledge distillation, denoted the neural couplings algorithm (NCA). The NCA yields a matrix that expresses the mapping of the mixture to the target source magnitude information. Using the NCA, we examine the mapping functions of three fundamental DAE-based models in music source separation: one with a single-layer encoder and decoder, one with a multi-layer encoder and a single-layer decoder, and one using skip-filtering (SF) connections with a single-layer encoder and decoder. We first train these models with realistic data to estimate the singing voice magnitude spectra from the corresponding mixture. We then use the optimized models and test spectral data as input to the NCA. Our experimental findings show that approaches based on the DAE model learn scalar filtering operators: their mapping functions exhibit a predominantly diagonal structure, which limits the exploitation of the inter-frequency structure of music data. In contrast, skip-filtering connections are shown to assist the DAE model in learning filtering operators that exploit richer inter-frequency structures.
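The distinction between a scalar (diagonal) filtering operator and one that exploits inter-frequency structure can be illustrated with a small numerical sketch. This is not the NCA, whose details the abstract does not give; as a stand-in, an ordinary least-squares fit recovers a matrix that maps toy "mixture" frames to toy "target" frames. The two toy mappings, the helper names `fit_linear_map` and `diagonal_energy_ratio`, and the diagonal-energy measure are all assumptions made for illustration.

```python
import numpy as np

def fit_linear_map(X, Y):
    """Least-squares matrix C such that each target frame y is approximately C @ x.

    X and Y hold one frame per row (frames x frequency bins).
    Assumption: a plain least-squares fit stands in for the NCA here.
    """
    # Solve X @ B = Y for B; then C = B.T maps a single frame x to C @ x.
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return B.T

def diagonal_energy_ratio(C):
    """Fraction of the squared matrix energy on the main diagonal (1.0 = purely diagonal)."""
    return float(np.sum(np.diag(C) ** 2) / np.sum(C ** 2))

rng = np.random.default_rng(0)
n_frames, n_bins = 500, 16
X = np.abs(rng.normal(size=(n_frames, n_bins)))  # toy non-negative "mixture magnitude" frames

# Toy mapping 1 (hypothetical): one scalar gain per frequency bin, i.e. a diagonal operator.
gains = rng.uniform(0.1, 0.9, size=n_bins)
Y_scalar = X * gains

# Toy mapping 2 (hypothetical): each output bin also depends on its two neighbouring bins.
M = np.diag(gains) + 0.3 * np.eye(n_bins, k=1) + 0.3 * np.eye(n_bins, k=-1)
Y_coupled = X @ M.T

C_scalar = fit_linear_map(X, Y_scalar)
C_coupled = fit_linear_map(X, Y_coupled)

print("diagonal energy, scalar mapping: ", diagonal_energy_ratio(C_scalar))
print("diagonal energy, coupled mapping:", diagonal_energy_ratio(C_coupled))
```

In this toy setting the matrix fitted to the scalar mapping is essentially diagonal, while the matrix fitted to the coupled mapping carries a visible share of its energy off the diagonal. That off-diagonal energy is the kind of inter-frequency structure the abstract refers to.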
