Constructing Markov models of words from multiple utterances

Abstract

Speech recognition is improved by splitting each feneme string at a consistent point into a left portion and a right portion. The present invention addresses the problem of constructing fenemic baseforms that take into account variations in the pronunciation of a word from one utterance to another. Specifically, the invention relates to a method of constructing a fenemic baseform for a word in a vocabulary of word segments, including the steps of:

(a) transforming multiple utterances of the word into respective strings of fenemes;
(b) defining a set of fenemic Markov model phone machines;
(c) determining the best single phone machine P1 for producing the multiple feneme strings;
(d) determining the best two-phone baseform, of the form P1 P2 or P2 P1, for producing the multiple feneme strings;
(e) aligning the best two-phone baseform against each feneme string;
(f) splitting each feneme string into a left portion and a right portion, the left portion corresponding to the first phone machine of the two-phone baseform and the right portion corresponding to the second phone machine;
(g) identifying each left portion as a left substring and each right portion as a right substring;
(h) processing the set of left substrings and the set of right substrings in the same manner as the set of feneme strings corresponding to the multiple utterances, with the further step of inhibiting further splitting of a substring when its single-phone baseform has a higher probability of producing the substring than does the best two-phone baseform; and
(k) concatenating the unsplit single phones in an order corresponding to the order of the feneme substrings to which they correspond.
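The recursive split-and-concatenate procedure of steps (c) through (k) can be sketched in Python. This is a minimal illustration, not the patented implementation. It assumes a toy phone machine that is not described in the abstract: each phone emits at least one feneme, loops with a fixed probability `STAY`, and draws fenemes independently from a per-phone table `emit` whose entries are all strictly positive. All function names are hypothetical. Because the toy phone cannot emit an empty string, a set containing a one-feneme string is never split, and ties between the one-phone and two-phone scores also stop the splitting.

```python
import math

STAY = 0.6  # assumed self-loop probability of the toy phone machine


def phone_logprob(emit, phone, string):
    """Log-probability that one toy phone emits the whole feneme string."""
    if not string:
        return -math.inf  # the toy phone must emit at least one feneme
    lp = (len(string) - 1) * math.log(STAY) + math.log(1 - STAY)
    return lp + sum(math.log(emit[phone][f]) for f in string)


def best_single(emit, strings):
    """Step (c): the phone that best produces all strings jointly."""
    scores = {p: sum(phone_logprob(emit, p, s) for s in strings) for p in emit}
    best = max(scores, key=scores.get)
    return best, scores[best]


def align_pair(emit, first, second, string):
    """Steps (e)-(f): best (score, split point) of one string for first+second."""
    return max(
        (phone_logprob(emit, first, string[:k])
         + phone_logprob(emit, second, string[k:]), k)
        for k in range(1, len(string))
    )


def best_pair(emit, strings, p1):
    """Step (d): best baseform P1 P2 or P2 P1, with one split per string."""
    if any(len(s) < 2 for s in strings):
        return None  # toy restriction: a one-feneme string cannot be split
    best = None
    for p2 in emit:
        for first, second in ((p1, p2), (p2, p1)):
            aligned = [align_pair(emit, first, second, s) for s in strings]
            score = sum(lp for lp, _ in aligned)
            if best is None or score > best[0]:
                best = (score, [k for _, k in aligned])
    return best


def build_baseform(emit, strings):
    """Steps (c)-(k): split recursively, then concatenate the unsplit phones."""
    p1, single_score = best_single(emit, strings)
    pair = best_pair(emit, strings, p1)
    if pair is None or single_score >= pair[0]:
        return [p1]  # step (h): inhibit further splitting
    _, splits = pair
    lefts = [s[:k] for s, k in zip(strings, splits)]
    rights = [s[k:] for s, k in zip(strings, splits)]
    return build_baseform(emit, lefts) + build_baseform(emit, rights)


# Two hypothetical phones and three "utterances" of the same word.
emit = {"A": {1: 0.9, 2: 0.1}, "B": {1: 0.1, 2: 0.9}}
utterances = [[1, 1, 2, 2], [1, 2, 2], [1, 1, 1, 2]]
print(build_baseform(emit, utterances))  # prints ['A', 'B']
```

In this toy run the three strings are split at different positions (after the second, first, and third feneme respectively), which is the point of aligning the two-phone baseform against each string separately: the split falls at a consistent point in the word even though the utterances differ in length.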