Source: EURASIP Journal on Audio, Speech, and Music Processing

Audio bandwidth extension based on temporal smoothing cepstral coefficients


Abstract

In this paper, we propose a wideband (WB) to super-wideband audio bandwidth extension (BWE) method based on temporal smoothing cepstral coefficients (TSCC). The temporal relationship of audio signals is incorporated into feature extraction in the BWE frontend so that the extended spectra evolve more smoothly over time. In the proposed scheme, a Gammatone auditory filter bank decomposes the audio signal, and the energy of each frequency band is long-term smoothed using minima controlled recursive averaging (MCRA) to suppress transient components. The resulting 'steady-state' spectrum is frequency weighted, and the TSCC are then obtained by means of a power-law loudness function and cepstral normalization. The extracted TSCC are fed into a Gaussian mixture model (GMM)-based Bayesian estimator to estimate the high-frequency (HF) spectral envelope, while the fine structure is restored by spectral translation. Evaluation results show that the TSCC exploit the temporal relationship of audio signals and provide higher mutual information between the low- and high-frequency parameters without increasing the dimension of the input vectors in the BWE frontend. In addition, the proposed method is applied to the G.729.1 wideband codec and outperforms the Mel frequency cepstral coefficient (MFCC)-based method in terms of log spectral distortion (LSD), cosh measure, and differential log spectral distortion. Furthermore, the proposed method improves the smoothness of the reconstructed spectrum over time and performs well in subjective listening tests.
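The feature-extraction chain summarized in the abstract can be illustrated with a rough sketch. The abstract gives no formulas, so everything below is an assumption rather than the authors' implementation: per-band energies are taken as already computed (the Gammatone filter bank is not implemented), a plain first-order recursive average stands in for full MCRA, the frequency-weighting step is skipped, the power-law exponent and smoothing factor are placeholder values, and "cepstral normalization" is read as mean subtraction over frames. All function names are hypothetical.

```python
import math


def recursive_smooth(band_energies, alpha=0.9):
    """First-order recursive averaging of each band's energy across frames.

    Simplified stand-in for MCRA (no minima tracking); alpha is assumed.
    """
    smoothed = []
    prev = list(band_energies[0])
    for frame in band_energies:
        prev = [alpha * p + (1.0 - alpha) * e for p, e in zip(prev, frame)]
        smoothed.append(prev)
    return smoothed


def power_law(frame, exponent=0.33):
    """Power-law compression of band energies (exponent is an assumed value)."""
    return [e ** exponent for e in frame]


def dct2(frame, n_coeffs):
    """Unnormalized DCT-II, keeping the first n_coeffs coefficients."""
    n = len(frame)
    return [
        sum(x * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
            for i, x in enumerate(frame))
        for k in range(n_coeffs)
    ]


def cepstral_mean_normalize(ceps):
    """Subtract the per-coefficient mean over all frames (assumed normalization)."""
    n_frames = len(ceps)
    means = [sum(col) / n_frames for col in zip(*ceps)]
    return [[c - m for c, m in zip(frame, means)] for frame in ceps]


def smoothed_cepstra(band_energies, n_coeffs=4, alpha=0.9, exponent=0.33):
    """Band energies (frames x bands) -> temporally smoothed cepstral features."""
    smoothed = recursive_smooth(band_energies, alpha)
    ceps = [dct2(power_law(f, exponent), n_coeffs) for f in smoothed]
    return cepstral_mean_normalize(ceps)


if __name__ == "__main__":
    # Four frames, four bands; frame 2 carries a transient in band 0.
    frames = [
        [1.0, 2.0, 3.0, 4.0],
        [1.0, 2.0, 3.0, 4.0],
        [50.0, 2.0, 3.0, 4.0],
        [1.0, 2.0, 3.0, 4.0],
    ]
    for feat in smoothed_cepstra(frames, n_coeffs=3):
        print([round(c, 4) for c in feat])
```

Because the smoothing runs on band energies before the cepstral transform, the transient in the toy input is attenuated without adding any dimensions to the feature vector, which is the property the abstract highlights.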
