Journal: IEEE Transactions on Audio, Speech, and Language Processing

An Auditory-Based Feature Extraction Algorithm for Robust Speaker Identification Under Mismatched Conditions

Abstract

An auditory-based feature extraction algorithm is presented. We name the new features cochlear filter cepstral coefficients (CFCCs); they are defined using a recently developed auditory transform (AT) plus a set of modules that emulate the signal processing functions of the cochlea. The CFCC features are applied to a speaker identification task to address the acoustic mismatch between training and testing environments. Usually, the performance of acoustic models trained on clean speech drops significantly when they are tested on noisy speech. The CFCC features show strong robustness in this situation. In our experiments, the CFCC features consistently perform better than the baseline MFCC features under all three mismatched testing conditions: white noise, car noise, and babble noise. For example, in clean conditions the MFCC and CFCC features perform similarly, both with accuracy above 96%, but when the signal-to-noise ratio (SNR) of the input signal is 6 dB, the accuracy of the MFCC features drops to 41.2%, while the CFCC features still achieve an accuracy of 88.3%. The proposed CFCC features also compare favorably with perceptual linear predictive (PLP) and RASTA-PLP features. The CFCC features consistently perform much better than PLP. Under white noise, the CFCC features are significantly better than RASTA-PLP, while under car and babble noise they perform similarly to RASTA-PLP.
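
To make the 6 dB SNR condition concrete, the following is a minimal sketch of how a mismatched test signal could be built by adding noise to a clean signal at a chosen SNR. It is not the authors' procedure, which the abstract does not describe. The function names `mix_at_snr` and `measured_snr_db` are hypothetical, the sine wave is only a stand-in for clean speech, and the noise is assumed to be at least as long as the signal.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech`, scaled so the mixture has the requested SNR in dB.

    Hypothetical helper: assumes `noise` has at least as many samples as `speech`.
    """
    speech = np.asarray(speech, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)[: len(speech)]
    p_speech = np.mean(speech ** 2)  # average power of the clean signal
    p_noise = np.mean(noise ** 2)    # average power of the unscaled noise
    # SNR_dB = 10 * log10(p_speech / (gain^2 * p_noise)), solved for gain.
    gain = np.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return speech + gain * noise


def measured_snr_db(clean, noisy):
    """Recover the SNR in dB from a clean signal and its noisy version."""
    residual = noisy - clean
    return 10.0 * np.log10(np.mean(clean ** 2) / np.mean(residual ** 2))


# Stand-in data: one second of a 220 Hz tone at 16 kHz plus white Gaussian noise.
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
clean = np.sin(2 * np.pi * 220 * t)
white = rng.standard_normal(len(clean))

noisy = mix_at_snr(clean, white, 6.0)
print(f"measured SNR: {measured_snr_db(clean, noisy):.2f} dB")
```

In an evaluation of the kind described, models would be trained on features from the clean signals and tested on features from mixtures like `noisy`; car or babble noise recordings would replace `white` for the other two conditions.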