Training and search methods for speech recognition.

Abstract

Speech recognition involves three processes: extraction of acoustic indices from the speech signal, estimation of the probability that the observed index string was caused by a hypothesized utterance segment, and determination of the recognized utterance via a search among hypothesized alternatives. This paper is not concerned with the first process. Estimation of the probability of an index string involves a model of index production by any given utterance segment (e.g., a word). Hidden Markov models (HMMs) are used for this purpose [Makhoul, J. & Schwartz, R. (1995) Proc. Natl. Acad. Sci. USA 92, 9956-9963]. Their parameters are state transition probabilities and output probability distributions associated with the transitions. The Baum algorithm that obtains the values of these parameters from speech data via their successive reestimation will be described in this paper. The recognizer wishes to find the most probable utterance that could have caused the observed acoustic index string. That probability is the product of two factors: the probability that the utterance will produce the string and the probability that the speaker will wish to produce the utterance (the language model probability). Even if the vocabulary size is moderate, it is impossible to search for the utterance exhaustively. One practical algorithm is described [Viterbi, A. J. (1967) IEEE Trans. Inf. Theory IT-13, 260-267] that, given the index string, has a high likelihood of finding the most probable utterance.
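As an illustration of the search step, the dynamic-programming recursion from the cited Viterbi paper can be sketched on a toy discrete HMM. This is a minimal sketch under stated assumptions, not the recognizer described in the paper: outputs are attached to states rather than to transitions (a common simplification), the two states `A` and `B`, the symbols `x` and `y`, and all probabilities are invented for illustration, and no language model probability or pruning is included.

```python
import math


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return (best_log_prob, best_path) for a discrete state-emission HMM."""
    # best[t][s]: log-probability of the best state path ending in s at time t
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    # back[t][s]: predecessor of s on that best path
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            # Pick the predecessor that maximizes the path score into s
            prev, score = max(
                ((p, best[t - 1][p] + log_trans[p][s]) for p in states),
                key=lambda item: item[1],
            )
            best[t][s] = score + log_emit[s][observations[t]]
            back[t][s] = prev
    # Trace back from the best final state
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return best[-1][last], path


def to_log(table):
    """Convert a nested dict of nonzero probabilities to log-probabilities."""
    return {
        k: to_log(v) if isinstance(v, dict) else math.log(v)
        for k, v in table.items()
    }


# Hypothetical toy model (all numbers are illustrative assumptions)
states = ["A", "B"]
log_start = to_log({"A": 0.6, "B": 0.4})
log_trans = to_log({"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}})
log_emit = to_log({"A": {"x": 0.9, "y": 0.1}, "B": {"x": 0.2, "y": 0.8}})

log_prob, path = viterbi(["x", "y", "y"], states, log_start, log_trans, log_emit)
print(path, math.exp(log_prob))
```

Working in log-probabilities avoids numerical underflow on long index strings. In a full recognizer the path score would also include the language model probability of the hypothesized utterance, which this sketch leaves out.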
