Journal: IEICE Transactions on Information and Systems

Statistical Bandwidth Extension for Speech Synthesis Based on Gaussian Mixture Model with Sub-Band Basis Spectrum Model



Abstract

This paper describes a novel statistical bandwidth extension (BWE) technique based on a Gaussian mixture model (GMM) and a sub-band basis spectrum model (SBM), in which each dimensional component represents a specific acoustic space in the frequency domain. The proposed method can perform BWE on speech data with an arbitrary frequency bandwidth, whereas conventional methods perform the conversion only from fixed narrow-band data. In the proposed method, a GMM is trained in advance on SBM parameters extracted from full-band spectra. According to the bandwidth of the input signal, the trained GMM is reconstructed as a GMM of the joint probability density between the low-band and high-band SBM components. The high-band SBM components are then estimated from the low-band SBM components of the input signal using the reconstructed GMM. Finally, BWE is achieved by adding the spectra decoded from the estimated high-band SBM components to those of the input signal. To construct a full-band signal from a narrow-band one, we apply this method to log-amplitude spectra and aperiodic components. Objective and subjective evaluation results show that the proposed method robustly extends the bandwidth of speech data for the log-amplitude spectra. Experimental results also indicate that, in the proposed method, the aperiodic component extracted from the upsampled narrow-band signal achieves the same performance as the restored and the full-band aperiodic components.
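The estimation step in the abstract (high-band components predicted from low-band components through a joint GMM) can be illustrated with a minimal sketch. It rests on several assumptions not stated in the abstract: the SBM vector is ordered from low to high frequency, so the first `k` dimensions are the observed low band; the GMM uses full covariance matrices; and the estimate is the conditional mean (MMSE), which may differ from the estimator used in the paper. The function name `estimate_high_band` and all toy parameter values are hypothetical, and SBM extraction and spectrum decoding are not shown.

```python
import numpy as np

def estimate_high_band(x_low, weights, means, covs):
    """Conditional-mean estimate of high-band components given low-band ones.

    x_low   : (k,) observed low-band vector (k depends on the input bandwidth)
    weights : (M,) mixture weights of the full-band GMM
    means   : (M, D) component means over the full-band vector
    covs    : (M, D, D) full covariance matrices
    """
    k = x_low.shape[0]
    log_resp = np.empty(len(weights))
    cond_means = []
    for m, (w, mu, S) in enumerate(zip(weights, means, covs)):
        # Partition each component into low-band (x) and high-band (y) blocks.
        mu_x, mu_y = mu[:k], mu[k:]
        S_xx, S_yx = S[:k, :k], S[k:, :k]
        diff = x_low - mu_x
        sol = np.linalg.solve(S_xx, diff)        # S_xx^{-1} (x - mu_x)
        _, logdet = np.linalg.slogdet(S_xx)
        # Log of w_m * N(x_low; mu_x, S_xx), the marginal low-band likelihood.
        log_resp[m] = np.log(w) - 0.5 * (diff @ sol + logdet + k * np.log(2 * np.pi))
        # Per-component conditional mean E[y | x, m].
        cond_means.append(mu_y + S_yx @ sol)
    # Normalise responsibilities in a numerically stable way.
    resp = np.exp(log_resp - np.max(log_resp))
    resp /= resp.sum()
    return resp @ np.array(cond_means)

# Toy two-component model over a 3-dimensional vector (values are illustrative only).
toy_weights = np.array([0.5, 0.5])
toy_means = np.array([[0.0, 0.0, 0.0], [10.0, 10.0, 10.0]])
toy_covs = np.stack([np.eye(3), np.eye(3)])
y_high = estimate_high_band(np.array([0.0, 0.0]), toy_weights, toy_means, toy_covs)
```

Because the split index `k` is chosen at estimation time, the same full-band model serves inputs of different bandwidths, which mirrors the arbitrary-bandwidth property the abstract describes.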
