Odyssey 2010: The Speaker and Language Recognition Workshop

Temporally Weighted Linear Prediction Features for Speaker Verification in Additive Noise

Abstract

We consider text-independent speaker verification under additive noise corruption. In the popular mel-frequency cepstral coefficient (MFCC) front-end, we replace the conventional Fourier-based spectrum estimation with weighted linear predictive methods, which have previously shown success in noise-robust speech recognition. We introduce two temporally weighted variants of linear predictive (LP) modeling to speaker verification and compare them with the FFT, which is normally used in computing MFCCs, and with conventional LP. We also investigate the effect of speech enhancement (spectral subtraction) on system performance with each of the four feature representations. Our experiments on the NIST 2002 SRE corpus indicate that the accuracies of the conventional and proposed features are close to each other on clean data. At 0 dB SNR, the baseline FFT and the better of the two proposed features give equal error rates (EERs) of 17.4% and 15.6%, respectively. These improve to 11.6% and 11.2%, respectively, when spectral subtraction is included as a pre-processing step. The new features hold promise for noise-robust speaker verification.
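The abstract does not spell out how the temporal weighting is defined, so the following is only a minimal sketch of one commonly described form of weighted linear prediction: each squared prediction error is weighted by the short-time energy of the preceding samples, and the resulting all-pole spectrum is computed from the fitted coefficients. The function names, the model order of 20, the 20-sample energy window, and the omission of a gain term are all illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def wlp_coefficients(frame, order=20, ste_window=20, eps=1e-9):
    """Fit a[1..order] minimising sum_n w[n] * (x[n] - sum_k a[k] * x[n-k])**2.

    w[n] is the short-time energy of the previous `ste_window` samples
    (an assumed weighting; samples outside the frame are treated as zero).
    """
    x = np.asarray(frame, dtype=float)
    n_samples = len(x)
    total = n_samples + order  # residual range n = 0 .. N + order - 1

    # Zero-pad so that x[n] lives at padded[n + order].
    padded = np.concatenate([np.zeros(order), x, np.zeros(order)])
    target = padded[order:order + total]
    # Column k-1 holds the delayed signal x[n - k].
    delayed = np.stack(
        [padded[order - k:order - k + total] for k in range(1, order + 1)],
        axis=1,
    )

    # w[n] = sum_{i=1..M} x[n - i]**2, computed with a running sum.
    squared = np.concatenate([np.zeros(ste_window), x ** 2, np.zeros(order)])
    running = np.concatenate([[0.0], np.cumsum(squared)])
    weights = running[ste_window:ste_window + total] - running[:total] + eps

    # Weighted least squares via row scaling by sqrt(w[n]).
    root_w = np.sqrt(weights)
    coeffs, *_ = np.linalg.lstsq(
        delayed * root_w[:, None], target * root_w, rcond=None
    )
    return coeffs


def all_pole_spectrum(coeffs, n_fft=512):
    """Return 1 / |A(e^{jw})|**2 for A(z) = 1 - sum_k a[k] * z**-k (gain omitted)."""
    inverse_filter = np.concatenate([[1.0], -np.asarray(coeffs, dtype=float)])
    response = np.fft.rfft(inverse_filter, n_fft)
    return 1.0 / np.maximum(np.abs(response) ** 2, 1e-12)
```

In an MFCC pipeline of the kind the abstract describes, `all_pole_spectrum(wlp_coefficients(frame))` would stand in for the squared FFT magnitude of each frame before the mel filterbank is applied. The second weighted variant mentioned in the abstract is not covered by this sketch.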