Journal: IETE Technical Review

Fuzzy Phoneme Classification Using Multi-speaker Vocal Tract Length Normalization

Abstract

The overall success of automatic speech recognition (ASR) depends on accurate phoneme recognition and on the quality of the speech signal the recognizer receives. Differences between speakers, however, degrade recognition performance, and inter-speaker variability is one of the main causes. Vocal tract length normalization (VTLN) compensates for inter-speaker variation by applying a speaker-specific warping to the frequency scale of the filter bank. Instead of measuring word-level performance with speaker-specific warping, this research works directly at the phoneme level and applies VTLN to the speech signals of all speakers to find the setting that gives the highest recognition performance. It compares the recognition results for each phoneme across warping factors from 0.74 to 1.54 in increments of 0.02, on nine different frequency warping boundary ranges. The warping factor and frequency warping range that give the highest phoneme recognition performance are then applied to word recognition. Compared with the baseline, a warping factor of 1.40 on the 300-5000 Hz frequency range improves phoneme recognition by 0.7% and spoken word recognition by 0.5%.
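The search space described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not give the shape of the warping function, so the piecewise-linear form below, the 0.875 breakpoint, the direction of the warp, and the names `warp_factors` and `piecewise_linear_warp` are all hypothetical and not taken from the paper.

```python
import numpy as np


def warp_factors(start=0.74, stop=1.54, step=0.02):
    """Grid of candidate warping factors, both ends included."""
    n = int(round((stop - start) / step)) + 1
    return np.round(start + step * np.arange(n), 2)


def piecewise_linear_warp(freqs, alpha, f_low, f_high):
    """Warp frequencies (Hz) inside [f_low, f_high]; identity elsewhere.

    Assumed shape (not from the paper): slope `alpha` from f_low up to a
    breakpoint, then a second linear segment that rejoins the identity
    at f_high, so the band edges map to themselves.
    """
    freqs = np.asarray(freqs, dtype=float)
    width = f_high - f_low
    # Breakpoint chosen so that alpha * b never exceeds the band width.
    b = 0.875 * width / max(alpha, 1.0)
    knots_in = [f_low, f_low + b, f_high]
    knots_out = [f_low, f_low + alpha * b, f_high]
    warped = np.interp(freqs, knots_in, knots_out)
    inside = (freqs >= f_low) & (freqs <= f_high)
    return np.where(inside, warped, freqs)


factors = warp_factors()
# Example: warp a 1000 Hz filter-bank centre frequency with the
# best setting reported in the abstract (1.40 on 300-5000 Hz).
warped_1k = piecewise_linear_warp([1000.0], alpha=1.40, f_low=300.0, f_high=5000.0)
```

In a full experiment, each factor in `factors` would be combined with each of the nine boundary ranges, the filter-bank centre frequencies would be warped accordingly, and the phoneme recognizer would be scored for every combination. Only the 300-5000 Hz range is named in the abstract, so the other eight ranges are left out here.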