International Journal of Computers & Applications

INCORPORATING PHONETIC KNOWLEDGE INTO AN EVOLUTIONARY SUBSPACE APPROACH FOR ROBUST SPEECH RECOGNITION


Abstract

The reliability of automatic speech recognition (ASR) systems depends closely on the parameterization process, which is expected to accurately characterize the phonetic, dynamic, and static components of speech. For this purpose, ASR methods build speech sound models from large speech corpora that attempt to include the common sources of variability that may occur in real-life conditions. Nevertheless, not all variability can reasonably be covered. As a result, the performance of current ASR systems, whose designs are predicated on relatively noise-free conditions, degrades rapidly under highly adverse conditions. To cope with mismatched (adverse) conditions and to achieve noise robustness, we present in this paper an original approach that operates in two steps. The first step integrates into the front-end process, alongside mean-subtracted mel-frequency cepstral coefficients, acoustic distinctive features that provide a more convenient interface to the higher-level components of ASR systems. The second step combines subspace filtering and genetic algorithms to obtain less-variant parameters. The advantages of this approach are that no noise estimation is required and that the recognition system itself is not modified. The effectiveness of the method is assessed in highly interfering car noise using a noisy subset of the TIMIT database. The results show that the proposed method drastically reduces the word error rate over a wide range of signal-to-noise ratios.
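
The abstract gives no implementation details, so the following is only an illustrative sketch of two of the ingredients it names: mean subtraction of cepstral features and a subspace projection. It assumes a plain principal-component projection as the subspace filter, and it does not attempt the genetic-algorithm step or the acoustic distinctive features. The function names, the 13-dimensional frames, and the choice of 8 retained components are hypothetical and are not taken from the paper.

```python
import numpy as np


def cepstral_mean_subtraction(features):
    """Subtract the per-utterance mean from every cepstral coefficient.

    features: array of shape (n_frames, n_coefficients).
    """
    return features - features.mean(axis=0, keepdims=True)


def subspace_project(features, n_components):
    """Keep only the n_components principal directions of the frames.

    Assumption: a fixed PCA basis stands in for the subspace filter;
    the paper combines subspace filtering with a genetic algorithm,
    which is not reproduced here.
    """
    mean = features.mean(axis=0, keepdims=True)
    centered = features - mean
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    basis = eigvecs[:, order]                   # (n_coefficients, n_components)
    return centered @ basis @ basis.T + mean


# Hypothetical data: 200 frames of 13 cepstral coefficients with a constant offset.
rng = np.random.default_rng(0)
frames = rng.normal(size=(200, 13)) + 5.0

cms = cepstral_mean_subtraction(frames)
filtered = subspace_project(cms, n_components=8)
```

After these two calls, `cms` has zero mean in every coefficient and `filtered` has the same shape as the input but varies only within an 8-dimensional subspace. In the approach described above, the parameters of the subspace would be evolved rather than fixed in advance.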