Sensors (Basel, Switzerland)

Unsupervised Learning for Monaural Source Separation Using Maximization–Minimization Algorithm with Time–Frequency Deconvolution


Abstract

This paper presents an unsupervised learning algorithm for sparse nonnegative matrix factor time–frequency deconvolution with an optimized fractional β-divergence. The β-divergence is a family of cost functions parametrized by a single parameter β. The Itakura–Saito divergence, the Kullback–Leibler divergence and the least-squares distance are special cases corresponding to β = 0, 1 and 2, respectively. The paper generalizes the algorithm to a flexible range of β that includes fractional values. It describes a maximization–minimization (MM) algorithm that leads to a fast-converging multiplicative update algorithm with guaranteed convergence. The proposed model operates in the time–frequency domain and decomposes an information-bearing matrix into a two-dimensional deconvolution of factor matrices that represent the spectral dictionary and the temporal codes. The deconvolution is optimized to yield sparse temporal codes by maximizing the likelihood of the observations. The paper also presents a method for estimating the fractional β value. The method is demonstrated on the separation of audio mixtures recorded from a single channel. The paper shows that the proposed algorithm extracts the spectral dictionary and temporal codes significantly more efficiently, which in turn leads to better source separation performance. Experimental tests and comparisons with other factorization methods have been conducted to verify its efficacy.
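
To make the role of β concrete, the following is a minimal sketch of the β-divergence in its commonly used parametrization. That the paper uses exactly this form is an assumption, although it is consistent with the special cases listed in the abstract. The function names `beta_divergence` and `total_divergence` are illustrative and do not come from the paper, and the sketch covers only the cost function, not the deconvolution model or the update rules.

```python
import math


def beta_divergence(x, y, beta):
    """Scalar beta-divergence d_beta(x | y) for strictly positive x and y.

    Assumed (standard) parametrization:
      beta = 0 -> Itakura-Saito divergence
      beta = 1 -> (generalized) Kullback-Leibler divergence
      beta = 2 -> half the squared Euclidean distance
    Any other real beta, including fractional values, uses the general formula.
    """
    if x <= 0 or y <= 0:
        raise ValueError("x and y must be strictly positive")
    if beta == 0:
        return x / y - math.log(x / y) - 1.0
    if beta == 1:
        return x * math.log(x / y) - x + y
    num = x ** beta + (beta - 1.0) * y ** beta - beta * x * y ** (beta - 1.0)
    return num / (beta * (beta - 1.0))


def total_divergence(V, V_hat, beta):
    """Sum the element-wise divergence between two equally sized
    nonnegative matrices given as lists of rows (e.g. a magnitude
    spectrogram V and its model approximation V_hat)."""
    return sum(
        beta_divergence(v, vh, beta)
        for row_v, row_vh in zip(V, V_hat)
        for v, vh in zip(row_v, row_vh)
    )
```

A fractional value such as `beta_divergence(2.0, 1.0, 0.5)` interpolates between the Itakura–Saito and Kullback–Leibler cases, and the general formula approaches the β = 1 expression as β tends to 1.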
