Journal: IEEE Transactions on Audio, Speech and Language Processing

Sparse Linear Regression With Structured Priors and Application to Denoising of Musical Audio



Abstract

We describe in this paper an audio denoising technique based on sparse linear regression with structured priors. The noisy signal is decomposed as a linear combination of atoms belonging to two modified discrete cosine transform (MDCT) bases, plus a residual part containing the noise. One MDCT basis has a long time resolution, and thus high frequency resolution, and is aimed at modeling tonal parts of the signal, while the other MDCT basis has short time resolution and is aimed at modeling transient parts (such as attacks of notes). The problem is formulated within a Bayesian setting. Conditional upon an indicator variable which is either 0 or 1, one expansion coefficient is set to zero or given a hierarchical prior. Structured priors are employed for the indicator variables; using two types of Markov chains, persistency along the time axis is favored for expansion coefficients of the tonal layer, while persistency along the frequency axis is favored for the expansion coefficients of the transient layer. Inference about the denoised signal and model parameters is performed using a Gibbs sampler, a standard Markov chain Monte Carlo (MCMC) sampling technique. We present results for denoising of a short glockenspiel excerpt and a long polyphonic music excerpt. Our approach is compared with unstructured sparse regression and with structured sparse regression in a single resolution MDCT basis (no transient layer). The results show that better denoising is obtained, both from signal-to-noise ratio measurements and from subjective criteria, when both a transient and tonal layer are used, in conjunction with our proposed structured prior framework.
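The structured prior described in the abstract can be illustrated with a small Python sketch. It draws binary indicator variables from two-state Markov chains, running along time for the tonal layer and along frequency for the transient layer, then sets each coefficient to zero or draws it from a hierarchical prior. This is only an illustration under assumptions: the function names, the transition probabilities, and the choice of a zero-mean Gaussian with an inverse-gamma variance as the hierarchical prior are all hypothetical and are not specified in the abstract. It samples from the prior only and does not implement the Gibbs sampler.

```python
import random


def sample_indicator_chain(length, p_stay_off, p_stay_on, p_init_on, rng):
    """Draw one binary sequence from a two-state Markov chain."""
    chain = [1 if rng.random() < p_init_on else 0]
    for _ in range(1, length):
        # Probability that the next indicator is 1, given the previous one.
        p_on = p_stay_on if chain[-1] == 1 else 1.0 - p_stay_off
        chain.append(1 if rng.random() < p_on else 0)
    return chain


def sample_layer_indicators(n_freq, n_frames, along, rng,
                            p_stay_off=0.98, p_stay_on=0.95, p_init_on=0.05):
    """Return indicators gamma[k][t] as n_freq rows of n_frames entries.

    The transition probabilities are illustrative assumptions.
    """
    args = (p_stay_off, p_stay_on, p_init_on, rng)
    if along == "time":
        # Tonal layer: one chain per frequency bin, running along time.
        return [sample_indicator_chain(n_frames, *args) for _ in range(n_freq)]
    if along == "frequency":
        # Transient layer: one chain per time frame, running along frequency.
        columns = [sample_indicator_chain(n_freq, *args) for _ in range(n_frames)]
        return [[columns[t][k] for t in range(n_frames)] for k in range(n_freq)]
    raise ValueError("along must be 'time' or 'frequency'")


def draw_coefficients(gamma, rng, shape=2.0, scale=1.0):
    """Set a coefficient to zero if its indicator is 0, else draw it.

    Assumed hierarchical prior: variance ~ inverse-gamma(shape, scale),
    coefficient ~ Normal(0, variance).
    """
    coeffs = []
    for row in gamma:
        out = []
        for g in row:
            if g == 0:
                out.append(0.0)
            else:
                variance = scale / rng.gammavariate(shape, 1.0)
                out.append(rng.gauss(0.0, variance ** 0.5))
        coeffs.append(out)
    return coeffs


def persistence(gamma, along):
    """Fraction of adjacent indicator pairs that agree along one axis."""
    n_freq, n_frames = len(gamma), len(gamma[0])
    if along == "time":
        pairs = [(gamma[k][t], gamma[k][t + 1])
                 for k in range(n_freq) for t in range(n_frames - 1)]
    else:
        pairs = [(gamma[k][t], gamma[k + 1][t])
                 for k in range(n_freq - 1) for t in range(n_frames)]
    return sum(a == b for a, b in pairs) / len(pairs)


rng = random.Random(0)
gamma_tonal = sample_layer_indicators(48, 120, "time", rng)
gamma_transient = sample_layer_indicators(48, 120, "frequency", rng)
tonal_coeffs = draw_coefficients(gamma_tonal, rng)
```

Calling `persistence(gamma_tonal, "time")` and `persistence(gamma_tonal, "frequency")` shows the intended effect: active tonal coefficients tend to form horizontal runs in the time-frequency plane, while active transient coefficients tend to form vertical ones.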
