IEEE Transactions on Signal Processing

Perceptual speech coding and enhancement using frame-synchronized fast wavelet packet transform algorithms

Abstract

This paper presents new wideband speech coding and integrated speech coding-enhancement systems based on frame-synchronized fast wavelet packet transform algorithms. It also formulates temporal and spectral psychoacoustic models of masking adapted to wavelet packet analysis. The algorithm of the proposed FFT-like overlapped block orthogonal wavelet packet transform permits us to efficiently approximate the auditory critical band decomposition in the time and frequency domains. This allows us to make use of the temporal and spectral masking properties of the human auditory system to decrease the average bit rate of the encoder while perceptually hiding the quantization error. The same wavelet packet representation is used to merge speech enhancement and coding in the context of auditory modeling. The advantage of the method presented in this paper over previous approaches is that perceptual enhancement and coding, which are usually implemented as a cascade of two separate systems, are combined. This leads to a decreased computational load. Experiments show that the proposed wideband coding procedure by itself can achieve transparent coding of speech signals sampled at 16 kHz at an average bit rate of 39.4 kbit/s. The combined speech coding-enhancement procedure achieves higher bit rate values that depend on the residual noise characteristics at the output of the enhancement process.
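To make the idea of approximating critical bands with a wavelet packet tree more concrete, here is a minimal Python sketch. It is not the paper's FFT-like overlapped block transform and it is not frame-synchronized: it uses the plain orthonormal Haar filter pair, which has poor frequency selectivity, purely for illustration. The names `haar_split`, `target_bandwidth`, and `decompose` are hypothetical, and the splitting rule (about 100 Hz at low frequencies, about 20% of the centre frequency higher up) is a rough rule of thumb for critical bandwidth assumed here, not a model taken from the paper.

```python
import math


def haar_split(x):
    """One orthonormal Haar analysis step: (lowpass, highpass), each half length."""
    s = 1.0 / math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    high = [(x[i] - x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    return low, high


def target_bandwidth(fc):
    """Assumed rough critical bandwidth in Hz at centre frequency fc (illustrative)."""
    return max(100.0, 0.2 * fc)


def decompose(x, f_lo, f_hi, flipped=False, depth=0, max_depth=7):
    """Split a node until its nominal band is no wider than the target bandwidth.

    Returns a list of (f_lo, f_hi, coefficients) in increasing frequency order.
    `flipped` tracks the spectral mirroring caused by downsampling a highpass
    branch, so that children are assigned to the correct half of the band.
    """
    width = f_hi - f_lo
    centre = 0.5 * (f_lo + f_hi)
    stop = (depth >= max_depth or len(x) < 2 or len(x) % 2 != 0
            or width <= target_bandwidth(centre))
    if stop:
        return [(f_lo, f_hi, list(x))]
    low, high = haar_split(x)
    # In a mirrored node the lowpass output carries the upper half of the band.
    lower, upper = (high, low) if flipped else (low, high)
    return (decompose(lower, f_lo, centre, False, depth + 1, max_depth)
            + decompose(upper, centre, f_hi, True, depth + 1, max_depth))


# Example: one 512-sample frame of a synthetic two-tone signal at 16 kHz.
fs = 16000.0
frame = [math.sin(2 * math.pi * 440 * n / fs)
         + 0.5 * math.sin(2 * math.pi * 3000 * n / fs) for n in range(512)]
bands = decompose(frame, 0.0, fs / 2)

for f_lo, f_hi, coeffs in bands:
    energy = sum(v * v for v in coeffs)
    print(f"{f_lo:7.1f}-{f_hi:7.1f} Hz  {len(coeffs):3d} coeffs  energy {energy:8.3f}")
```

The resulting tree is deeper at low frequencies than at high frequencies, which is the qualitative property a critical-band approximation needs. Because each split is orthonormal, the total number of coefficients and the total energy match those of the input frame. For scale, the reported 39.4 kbit/s at a 16 kHz sampling rate corresponds to 39400 / 16000 ≈ 2.46 bits per sample on average.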
