IEEE Transactions on Speech and Audio Processing
Utterance verification in continuous speech recognition: decoding and training procedures

Abstract

This paper introduces a set of acoustic modeling and decoding techniques for utterance verification (UV) in hidden Markov model (HMM) based continuous speech recognition (CSR). Utterance verification in this work means the ability to determine when portions of a hypothesized word string correspond to incorrectly decoded vocabulary words or to out-of-vocabulary words that may appear in an utterance. This capability is implemented here as a likelihood ratio (LR) based hypothesis testing procedure for verifying individual words in a decoded string. Two UV techniques are presented. The first is a procedure for estimating the parameters of UV models during training according to an optimization criterion that is directly related to the LR measure used in UV. The second is a speech recognition decoding procedure in which the "best" decoded path is defined as the one that optimizes an LR criterion. These techniques were evaluated in terms of their ability to improve UV performance on a speech dialog task over the public switched telephone network. The results of an experimental study presented in the paper show that LR based parameter estimation yields a significant improvement in UV performance for this task. The study also found that the LR based decoding procedure, when used in conjunction with models trained using the LR criterion, can provide as much as an 11% improvement in UV performance compared to existing UV procedures. Finally, the performance of the LR decoder was found to be highly dependent on the use of the LR criterion in training the acoustic models. Several observations are made in the paper concerning the formation of confidence measures for UV and the interaction of these techniques with the statistical language models used in automatic speech recognition (ASR).
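
To make the idea of LR based word verification concrete, here is a minimal Python sketch of a generic likelihood ratio test applied to each word in a decoded string. It is not the paper's implementation. The names `WordHypothesis`, `word_log_lr`, and `verify`, the per-frame length normalization, and the threshold value are all illustrative assumptions. The sketch also assumes the per-frame log-likelihoods under a target model and an alternative model have already been computed by some recognizer.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class WordHypothesis:
    """One decoded word with hypothetical per-frame acoustic scores."""
    word: str
    target_loglik: Sequence[float]       # log p(frame | target word model)
    alternative_loglik: Sequence[float]  # log p(frame | alternative model)


def word_log_lr(hyp: WordHypothesis) -> float:
    """Log likelihood ratio for one word, averaged over its frames.

    Dividing by the frame count is an assumption made here so that
    long and short words can share a single threshold.
    """
    n = len(hyp.target_loglik)
    if n == 0 or n != len(hyp.alternative_loglik):
        raise ValueError("score sequences must be non-empty and equal length")
    return (sum(hyp.target_loglik) - sum(hyp.alternative_loglik)) / n


def verify(hyps: Sequence[WordHypothesis],
           threshold: float = 0.0) -> List[Tuple[str, bool]]:
    """Accept a word when its log LR reaches the (assumed) threshold."""
    return [(h.word, word_log_lr(h) >= threshold) for h in hyps]


# Toy scores, invented for illustration only.
hyps = [
    WordHypothesis("call", [-1.0, -1.2], [-2.0, -2.4]),
    WordHypothesis("home", [-3.0, -3.0], [-1.0, -2.0]),
]
decisions = verify(hyps)
```

In this toy input the first word scores better under its target model than under the alternative and is accepted, while the second is rejected. In practice the threshold would be tuned on held-out data to trade false acceptances against false rejections.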