Journal: IEEE Transactions on Audio, Speech, and Language Processing

Wrapped Gaussian Mixture Models for Modeling and High-Rate Quantization of Phase Data of Speech



Abstract

The harmonic representation of speech signals has found many applications in speech processing. This paper presents a novel statistical approach to model the behavior of harmonic phases. Phase information is decomposed into three parts: a minimum phase part, a translation term, and a residual term referred to as dispersion phase. Dispersion phases are modeled by wrapped Gaussian mixture models (WGMMs) using an expectation-maximization algorithm suitable for circular vector data. A multivariate WGMM-based phase quantizer is then proposed and constructed using novel scalar quantizers for circular random variables. The proposed phase modeling and quantization scheme is evaluated in the context of a narrowband harmonic representation of speech. Results indicate that it is possible to construct a variable-rate harmonic codec that is equivalent to iLBC at approximately 13 kbps.
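As a rough illustration of the wrapped Gaussian mixture idea mentioned in the abstract, the sketch below evaluates a univariate wrapped Gaussian density by truncating the infinite sum over 2π shifts, then combines several such components into a mixture. This is the textbook scalar form, not the paper's multivariate model or its EM procedure; the function names, the `n_wraps` truncation parameter, and the example parameter values are assumptions made for illustration only.

```python
import math


def wrapped_gaussian_pdf(theta, mu, sigma, n_wraps=3):
    """Approximate wrapped Gaussian density at angle theta (radians).

    The exact density sums a Gaussian over all shifts of 2*pi*k; here the
    sum is truncated to k in [-n_wraps, n_wraps] (an illustrative choice).
    """
    total = 0.0
    for k in range(-n_wraps, n_wraps + 1):
        z = (theta - mu + 2.0 * math.pi * k) / sigma
        total += math.exp(-0.5 * z * z)
    return total / (sigma * math.sqrt(2.0 * math.pi))


def wgmm_pdf(theta, weights, mus, sigmas, n_wraps=3):
    """Density of a univariate wrapped Gaussian mixture (weights sum to 1)."""
    return sum(
        w * wrapped_gaussian_pdf(theta, m, s, n_wraps)
        for w, m, s in zip(weights, mus, sigmas)
    )


if __name__ == "__main__":
    # Hypothetical two-component mixture; values are not taken from the paper.
    weights, mus, sigmas = [0.3, 0.7], [0.0, 3.0], [0.4, 0.5]
    for angle in (-math.pi, 0.0, 3.0):
        print(angle, wgmm_pdf(angle, weights, mus, sigmas))
```

Because the density is periodic, a component centered near π (such as the one with mean 3.0 above) also places mass near -π, which is the behavior an ordinary Gaussian mixture cannot capture for phase data.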
