Journal: IEEE Transactions on Speech and Audio Processing

Perceptual speech processing and phonetic feature mapping for robust vowel recognition

Abstract

We propose perceptual speech processing and phonetic feature mapping, both inspired by the characteristics of human auditory perception. The proposed perceptual speech processing is based on three perceptual characteristics and consists of three independent processing steps: masking effect, minimum audible field renormalization, and mel-scale resampling. These steps remove imperceptible spectral components and adjust the magnitude and frequency scales of speech spectra, respectively. We apply the three steps to the speech spectrum sequentially to generate a new speech signal representation called the perceptual spectrum. For Mandarin vowel recognition, nine representative vowels are selected as references, and similarity measures to these reference spectra, called phonetic features, are then generated from the perceptual spectrum. These phonetic features then serve as speech parameters in a continuous HMM-based recognition stage. With these two techniques, high recognition accuracy on Mandarin vowel phonemes has been achieved. Further experiments confirm that a significant improvement in recognition robustness with respect to speaker variation and noise contamination can also be obtained.
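
As a rough illustration of two of the ideas named in the abstract, the sketch below resamples a magnitude spectrum on the mel scale and then maps it to a vector of similarities against reference spectra. It is not the authors' implementation. The mel formula (the common 2595 log10(1 + f/700) form), the linear interpolation, the use of cosine similarity, and all function names are assumptions made for illustration; the abstract does not specify any of them. The masking and minimum audible field renormalization steps are omitted because the abstract gives no detail on how they are computed.

```python
import math


def hz_to_mel(f_hz):
    # A widely used mel formula; assumed here, not taken from the paper.
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_resample(spectrum, sample_rate, n_out):
    """Sample a linear-frequency magnitude spectrum at n_out points
    equally spaced on the mel axis, using linear interpolation.

    Assumes spectrum bins span 0 Hz to the Nyquist frequency inclusive,
    and that len(spectrum) >= 2 and n_out >= 2.
    """
    n_in = len(spectrum)
    nyquist = sample_rate / 2.0
    max_mel = hz_to_mel(nyquist)
    out = []
    for k in range(n_out):
        f = mel_to_hz(max_mel * k / (n_out - 1))
        pos = f / nyquist * (n_in - 1)      # fractional bin index
        lo = min(int(pos), n_in - 2)        # keep lo + 1 in range
        frac = pos - lo
        out.append((1.0 - frac) * spectrum[lo] + frac * spectrum[lo + 1])
    return out


def cosine_similarity(a, b):
    # One possible similarity measure; the paper's measure may differ.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def phonetic_features(perceptual_spectrum, reference_spectra):
    """Return one similarity value per reference spectrum
    (nine reference vowels in the setting the abstract describes)."""
    return [cosine_similarity(perceptual_spectrum, ref)
            for ref in reference_spectra]
```

In this sketch, each analysis frame would yield one feature vector with as many entries as there are reference vowels, and a sequence of such vectors could then be passed to an HMM-based recognizer.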
