Journal: IEEE Transactions on Speech and Audio Processing

Speaker identification based on the use of robust cepstral features obtained from pole-zero transfer functions



Abstract

A common problem in speaker identification systems is that a mismatch between training and testing conditions substantially degrades performance. We attempt to alleviate this problem by proposing new features that show less variation when speech is corrupted by convolutional noise (channel) and/or additive noise. The conventional feature is the linear predictive (LP) cepstrum, which is derived from an all-pole transfer function that, in turn, closely approximates the spectral envelope of the speech. A different cepstral feature based on a pole-zero function (called the adaptive component weighted, or ACW, cepstrum) was previously introduced. We propose four additional cepstral features based on pole-zero transfer functions. One is an alternative way of doing adaptive component weighting and is called the ACW2 cepstrum. Two others (known as the PFL1 cepstrum and the PFL2 cepstrum) are based on a pole-zero postfilter used in speech enhancement. The fourth comes from an autoregressive moving-average (ARMA) analysis of speech, which yields a pole-zero transfer function describing the spectral envelope; the cepstrum of this transfer function is the feature. Experiments with a closed-set, text-independent, vector-quantizer-based speaker identification system compare the various features on the TIMIT and King databases. The ACW and PFL1 features are preferred, since they perform as well as or better than the LP cepstrum under all test conditions. The corresponding spectra show a clear emphasis of the formants and no spectral tilt.
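To make the link between a transfer function and its cepstrum concrete, the following is a minimal sketch of the standard recursion that maps the coefficients of a monic, minimum-phase polynomial to cepstral coefficients. It assumes the convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p for the denominator and B(z) = 1 + b_1 z^-1 + ... for the numerator. The function names (`poly_cepstrum`, `lp_cepstrum`, `pole_zero_cepstrum`) are hypothetical, and the numerators used by the ACW, ACW2, PFL1, PFL2, and ARMA features are not given in the abstract, so the numerator below is only a placeholder.

```python
import numpy as np

def poly_cepstrum(coeffs, n_ceps):
    """Cepstrum of log P(z), n = 1..n_ceps, for P(z) = 1 + coeffs[1] z^-1 + ...

    Assumes coeffs[0] == 1 and that P(z) is minimum phase
    (all roots strictly inside the unit circle).
    """
    x = np.zeros(n_ceps + 1)
    m = min(len(coeffs), n_ceps + 1)
    x[:m] = coeffs[:m]                      # zero-pad beyond the polynomial order
    c = np.zeros(n_ceps + 1)
    for n in range(1, n_ceps + 1):
        # c[n] = x[n] - (1/n) * sum_{k=1}^{n-1} k * c[k] * x[n-k]
        acc = sum(k * c[k] * x[n - k] for k in range(1, n))
        c[n] = x[n] - acc / n
    return c[1:]

def lp_cepstrum(a, n_ceps):
    """Cepstrum of the all-pole model 1/A(z): log(1/A) = -log(A)."""
    return -poly_cepstrum(a, n_ceps)

def pole_zero_cepstrum(b, a, n_ceps):
    """Cepstrum of B(z)/A(z): log(B/A) = log(B) - log(A)."""
    return poly_cepstrum(b, n_ceps) - poly_cepstrum(a, n_ceps)

# Illustrative coefficients only (assumed, not taken from the paper).
a = np.array([1.0, -0.5])   # single pole at z = 0.5
b = np.array([1.0, -0.2])   # single zero at z = 0.2
print(lp_cepstrum(a, 4))
print(pole_zero_cepstrum(b, a, 4))
```

For the single-pole case the result can be checked against the closed form r^n / n with r = 0.5, and the pole-zero case against (0.5^n - 0.2^n) / n. Because the logarithm turns the ratio B(z)/A(z) into a difference, any pole-zero feature of this kind is the LP cepstrum plus a correction term contributed by the numerator.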
