Source: IEE proceedings, Part K. Vision, image and signal processing

Strategies to improve the performance of very low bit rate speech coders and application to a variable rate 1.2 kb/s codec



Abstract

This paper presents several strategies to improve the performance of very low bit rate speech coders and describes a speech codec that incorporates these strategies and operates at an average bit rate of 1.2 kb/s. The encoding algorithm is based on several improvements to a mixed multiband excitation (MMBE) linear predictive coding (LPC) structure. A switched-predictive vector quantiser technique that outperforms previously reported schemes is adopted to encode the line spectral frequency (LSF) parameters. Spectral and sound-specific low rate models are used to achieve high quality speech at low rates. An MMBE approach with three sub-bands is employed to encode voiced frames, while modelling and synthesis techniques for fricatives and stops are used for unvoiced frames. This strategy is shown to provide good quality synthesised speech at a bit rate of only 0.4 kb/s for unvoiced frames. To reduce coding noise and improve the decoded speech, a spectral envelope restoration combined with noise reduction (SERNR) postfilter is used. The contributions of the techniques described in this paper are assessed separately and then combined in the design of a low bit rate codec that is evaluated against the North American Mixed Excitation Linear Prediction (MELP) coder. The performance assessment is carried out in terms of the spectral distortion of LSF quantisation, mean opinion score (MOS), A/B comparison tests and the ITU-T P.862 perceptual evaluation of speech quality (PESQ) standard. Assessment results show that the improved methods for LSF quantisation, sound-specific modelling and synthesis, and the new postfiltering approach can significantly outperform previously reported techniques. Further results also indicate that a system combining the proposed improvements and operating at 1.2 kb/s is comparable to, and slightly outperforms, a MELP coder operating at 2.4 kb/s. In tandem connection situations, the proposed system is clearly superior to the MELP coder.
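The abstract names spectral distortion as the measure used to assess LSF quantisation but does not define it. As a rough illustration only, the sketch below computes the commonly used log spectral distortion: the root-mean-square difference, in dB, between the LPC envelope of a reference frame and that of its quantised version. It is an assumption that the paper uses this form; the function names, the full-band frequency range, the 256-point grid and the example coefficients are all hypothetical, and the sketch takes LPC coefficients directly rather than converting from LSFs.

```python
import cmath
import math


def lpc_log_spectrum_db(lpc, n_points=256):
    """Log power spectrum (dB) of the all-pole filter 1/A(z) on [0, pi).

    `lpc` holds [a0, a1, ..., ap], the coefficients of A(z), normally
    with a0 = 1. Assumes A(z) has no zeros on the unit circle.
    """
    spectrum = []
    for k in range(n_points):
        omega = math.pi * k / n_points
        # Evaluate A(e^{j*omega}) directly from the coefficients.
        a = sum(c * cmath.exp(-1j * omega * i) for i, c in enumerate(lpc))
        # 10*log10(1/|A|^2) = -20*log10(|A|)
        spectrum.append(-20.0 * math.log10(abs(a)))
    return spectrum


def spectral_distortion_db(lpc_ref, lpc_quant, n_points=256):
    """RMS difference (dB) between two LPC log spectra (assumed definition)."""
    ref = lpc_log_spectrum_db(lpc_ref, n_points)
    quant = lpc_log_spectrum_db(lpc_quant, n_points)
    mean_sq = sum((r - q) ** 2 for r, q in zip(ref, quant)) / n_points
    return math.sqrt(mean_sq)


# Hypothetical first-order example: a reference frame and a slightly
# perturbed ("quantised") version of it.
sd = spectral_distortion_db([1.0, -0.9], [1.0, -0.85])
print(f"spectral distortion: {sd:.3f} dB")
```

In an evaluation of this kind the per-frame value would typically be averaged over a test set; the abstract gives no figures, so none are assumed here.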
