Journal: Speech Communication

Boosting HMM acoustic models in large vocabulary speech recognition

Abstract

Boosting algorithms have been successfully used to improve performance in a variety of classification tasks. Here, we suggest an approach to apply a popular boosting algorithm (called "AdaBoost.M2") to Hidden Markov Model based speech recognizers, at the level of utterances. In a variety of recognition tasks we show that boosting significantly improves the best test error rates obtained with standard maximum likelihood training. In addition, results in several isolated word decoding experiments show that boosting may also provide further performance gains over discriminative training, when both training techniques are combined. In our experiments this also holds when comparing final classifiers with a similar number of parameters and when evaluating in decoding conditions with lexical and acoustic mismatch to the training conditions. Moreover, we present an extension of our algorithm to large vocabulary continuous speech recognition, allowing online recognition without further processing of N-best lists or word lattices. This is achieved by using a lexical approach for combining different acoustic models in decoding. In particular, we introduce a weighted summation over an extended set of alternative pronunciation models representing both the boosted models and the baseline model. In this way, arbitrarily long utterances can be recognized by the boosted ensemble in a single pass decoding framework. Evaluation results are presented on two tasks: a real-life spontaneous speech dictation task with a 60k word vocabulary and Switchboard. (C) 2005 Elsevier B.V. All rights reserved.
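
The abstract names AdaBoost.M2 but does not spell out the update, so the following is only a minimal sketch of one round of the standard AdaBoost.M2 algorithm as it is usually stated, applied per utterance. It is not the authors' implementation. The function names (`adaboost_m2_round`, `combine`), the data layout, and the idea that each acoustic model returns a confidence in [0, 1] for every candidate label of an utterance are assumptions made for illustration.

```python
import math


def adaboost_m2_round(D, h, labels):
    """Run one AdaBoost.M2 round (hypothetical helper, illustration only).

    D[i][y]   : mislabel weight for utterance i and wrong label y;
                D[i][labels[i]] is expected to be 0 and all weights sum to 1.
    h[i][y]   : assumed confidence in [0, 1] that the current model gives
                to label y for utterance i.
    labels[i] : index of the correct label of utterance i.
    Returns (beta, new_D).
    """
    n, k = len(D), len(D[0])

    # Pseudo-loss: weighted penalty for low confidence in the correct label
    # and high confidence in wrong labels.
    eps = 0.0
    for i in range(n):
        c = labels[i]
        for y in range(k):
            if y != c:
                eps += D[i][y] * (1.0 - h[i][c] + h[i][y])
    eps *= 0.5
    if not 0.0 < eps < 1.0:
        raise ValueError("pseudo-loss must lie strictly between 0 and 1")

    beta = eps / (1.0 - eps)

    # Reweight: pairs (i, y) the model already separates well get a larger
    # exponent and therefore shrink more when beta < 1.
    new_D = [[0.0] * k for _ in range(n)]
    for i in range(n):
        c = labels[i]
        for y in range(k):
            if y != c:
                exponent = 0.5 * (1.0 + h[i][c] - h[i][y])
                new_D[i][y] = D[i][y] * beta ** exponent

    # Normalise so the mislabel weights form a distribution again.
    z = sum(sum(row) for row in new_D)
    new_D = [[w / z for w in row] for row in new_D]
    return beta, new_D


def combine(hs, betas):
    """Pick the label with the highest log(1/beta)-weighted confidence.

    hs[t][y] : confidence of the model from round t for label y
               on a single utterance.
    betas[t] : beta value of round t.
    """
    k = len(hs[0])
    scores = [
        sum(math.log(1.0 / b) * h_t[y] for h_t, b in zip(hs, betas))
        for y in range(k)
    ]
    return max(range(k), key=scores.__getitem__)
```

In this sketch, an utterance whose correct label receives low confidence keeps more weight after the update, so the next model is trained with more emphasis on it. The final decision weights each model by log(1/beta), so models with a lower pseudo-loss count for more.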