IEEE International Conference on Acoustics, Speech and Signal Processing

SPECTRO-TEMPORAL FEATURES FOR NOISE-ROBUST SPEECH RECOGNITION USING POWER-LAW NONLINEARITY AND POWER-BIAS SUBTRACTION


Abstract

Previous work has demonstrated that spectro-temporal Gabor features reduce word error rates for automatic speech recognition under noisy conditions. However, features based on mel spectra are easily corrupted in the presence of noise or channel distortion. We have exploited an algorithm for power normalized cepstral coefficients (PNCCs) to generate a more robust spectro-temporal representation. We refer to it as the power normalized spectrum (PNS), and to the corresponding output processed by Gabor filters and MLP nonlinear weighting as PNS-Gabor. We show that the proposed feature outperforms the state-of-the-art noise-robust features ETSI-AFE and PNCC on both Aurora2 and a noisy version of the Wall Street Journal (WSJ) corpus. A comparison of the individual processing steps of mel spectra and PNS shows that power bias subtraction is the aspect of PNS-Gabor features that contributes most to the improvement over Mel-Gabor features. The result indicates that Gabor processing compensates for the limitation of PNCC on channels with a frequency-shift characteristic. Overall, PNS-Gabor features decrease the word error rate by 32% relative to MFCC and 13% relative to PNCC on Aurora2. For noisy WSJ, they decrease the word error rate by 30.9% relative to MFCC and 24.7% relative to PNCC.
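The abstract names three processing ideas (power-bias subtraction, a power-law nonlinearity, and spectro-temporal Gabor filtering) without giving implementation details. The following is a minimal sketch of how such a chain might look, not the authors' implementation. The exponent 1/15 is the value commonly used in the PNCC literature. Everything else is an illustrative assumption: the fixed per-channel bias, the flooring rule, the Hann envelope, the filter size, and the modulation frequencies. The function names are hypothetical, and the MLP weighting stage is omitted.

```python
import numpy as np

def power_bias_subtract(power, bias, floor_ratio=1e-3):
    """Subtract a per-channel bias from a (channels, frames) power spectrogram.

    Assumption: the bias is given; the result is floored at a small
    fraction of the original power so it stays positive.
    """
    power = np.asarray(power, dtype=float)
    bias = np.asarray(bias, dtype=float).reshape(-1, 1)
    return np.maximum(power - bias, floor_ratio * power)

def power_law(power, exponent=1.0 / 15.0):
    """Power-law compression, used in place of the logarithm of MFCC-style features."""
    return np.power(power, exponent)

def gabor_filter_2d(n_chan=9, n_frames=15, omega_k=0.5, omega_n=0.25):
    """Complex 2D Gabor filter: a plane-wave carrier under a Hann envelope.

    omega_k and omega_n are spectral and temporal modulation frequencies
    in radians per channel and per frame (illustrative values).
    """
    k = np.arange(n_chan) - (n_chan - 1) / 2
    n = np.arange(n_frames) - (n_frames - 1) / 2
    envelope = np.outer(np.hanning(n_chan), np.hanning(n_frames))
    carrier = np.exp(1j * (omega_k * k[:, None] + omega_n * n[None, :]))
    return envelope * carrier

def filter_spectrogram(spec, kernel):
    """Same-size 2D correlation of spec with the real part of kernel (zero padding).

    Assumes odd kernel dimensions.
    """
    kr = np.real(kernel)
    kh, kw = kr.shape
    padded = np.pad(spec, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.empty_like(spec)
    for i in range(spec.shape[0]):
        for j in range(spec.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kr)
    return out

# Toy data: a random "clean" power spectrogram plus a constant power offset.
rng = np.random.default_rng(0)
clean = rng.uniform(1.0, 10.0, size=(23, 50))
noisy = clean + 2.0

pns = power_law(power_bias_subtract(noisy, bias=np.full(23, 2.0)))
features = filter_spectrogram(pns, gabor_filter_2d())
print(features.shape)
```

In this toy case the bias equals the added offset exactly, so the subtraction recovers the clean power before compression. In practice the bias has to be estimated from the signal, which this sketch does not attempt.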
