Conference: 2011 Data Compression Conference

Hybrid Scalar/Vector Quantization of Mel-Frequency Cepstral Coefficients for Low Bit-Rate Coding of Speech

Abstract

In this paper, we propose a low bit-rate speech codec based on hybrid scalar/vector quantization of the mel-frequency cepstral coefficients (MFCCs). We begin by showing that if a high-resolution mel-frequency cepstrum (MFC) is computed, good-quality speech can be reconstructed from the MFCCs despite the lack of explicit phase information. By evaluating the contribution that individual MFCCs make toward speech quality and applying appropriate quantization, we show that the perceptual evaluation of speech quality (PESQ) score of the MFCC-based codec matches that of the state-of-the-art MELPe codec at 600 bps and exceeds that of the CELP codec at coding rates of 2000 to 4000 bps. The main advantage of the proposed codec is in distributed speech recognition (DSR): speech features based on MFCCs can be obtained directly from the code words, which eliminates the additional decoding and feature extraction stages.
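The general idea of hybrid scalar/vector quantization can be illustrated with a minimal sketch. The abstract does not say which coefficients are scalar-quantized, how bits are allocated, or how the codebooks are trained, so everything below is an assumption for illustration: the first `n_scalar` coefficients get a uniform scalar quantizer, the remaining coefficients are coded jointly against a given codebook, and all function names and parameters are hypothetical.

```python
import numpy as np

def scalar_quantize(x, lo, hi, bits):
    """Uniform scalar quantizer: map each value to an index in [0, 2**bits - 1]."""
    levels = 2 ** bits
    step = (hi - lo) / levels
    idx = np.floor((np.clip(x, lo, hi) - lo) / step).astype(int)
    return np.minimum(idx, levels - 1)  # x == hi falls into the top cell

def scalar_dequantize(idx, lo, hi, bits):
    """Reconstruct each value as the midpoint of its quantization cell."""
    step = (hi - lo) / 2 ** bits
    return lo + (idx + 0.5) * step

def vector_quantize(v, codebook):
    """Return the index of the codebook row closest to v (squared Euclidean distance)."""
    distances = np.sum((codebook - v) ** 2, axis=1)
    return int(np.argmin(distances))

def encode_frame(mfcc, n_scalar, lo, hi, bits, codebook):
    """Assumed split: scalar-quantize the first n_scalar MFCCs, vector-quantize the rest."""
    s_idx = scalar_quantize(mfcc[:n_scalar], lo, hi, bits)
    v_idx = vector_quantize(mfcc[n_scalar:], codebook)
    return s_idx, v_idx

def decode_frame(s_idx, v_idx, lo, hi, bits, codebook):
    """Rebuild the MFCC vector directly from the transmitted indices."""
    return np.concatenate([scalar_dequantize(s_idx, lo, hi, bits), codebook[v_idx]])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    codebook = rng.normal(size=(16, 8))   # placeholder codebook, not a trained one
    mfcc = rng.normal(size=12)            # placeholder 12-coefficient frame
    s_idx, v_idx = encode_frame(mfcc, 4, -8.0, 8.0, 5, codebook)
    rec = decode_frame(s_idx, v_idx, -8.0, 8.0, 5, codebook)
    print("scalar indices:", s_idx, "vector index:", v_idx)
    print("max scalar error:", np.max(np.abs(rec[:4] - mfcc[:4])))
```

In this sketch `decode_frame` returns MFCC values straight from the indices, which is the property the abstract points to for DSR: a recognizer could consume the decoded coefficients without first resynthesizing speech.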