International conference on multimodal interfaces

Two-Level Intercoupling HMM's For Speech Recognition



Abstract

Many experimental results in Chinese speech recognition show that the performance of a recognizer can be improved remarkably if the influence of tone on recognition is taken into account. In utterances of different syllables carrying the same tone pattern, the tone can be perceived independently of the syllables; likewise, in utterances of the same syllable carrying different tone patterns, the syllable can be perceived separately. Accordingly, the speech features of the syllable should be separable from those of the tone, given their independence in auditory perception and phonetics. In practice, however, it is extremely difficult to extract the corresponding features from speech signals with existing algorithms, so the speech representations commonly used in Chinese speech recognition contain the features of both syllables and tones. Consequently, Chinese speech recognition requires models of a new structure in which the syllable features and the tone features of a toned syllable are modeled by two distinct models, one for the syllable and the other for the tone, and these models should intercouple. In this paper, on the basis of the fundamental HMM framework, the authors describe the relations between syllable HMMs and tone HMMs by introducing intercoupling probabilities, and present a novel kind of two-level intercoupling HMM. A Baum-Welch algorithm is developed to reestimate its parameters. To reduce the amount of computation in recognition, several approaches to simplifying the recognition scheme are suggested. The proposed approach increased recognition accuracy from 86.7% to 92.2% on the training set and from 82.9% to 87.3% on an independent set of test data.
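The abstract does not give the exact coupling formulation, but the overall idea — scoring a syllable HMM and a tone HMM jointly, tied by an intercoupling probability — can be illustrated with a minimal sketch. The sketch below is an assumption-laden simplification: it uses a standard scaled forward algorithm for each HMM and combines the two log-likelihoods additively with a hypothetical log-coupling term `log_coupling[i, j]`; the paper's actual reestimation and simplified recognition schemes are not reproduced here.

```python
import numpy as np

def forward_log_likelihood(obs, pi, A, B):
    """Standard scaled forward algorithm: returns log P(obs | HMM).

    pi: (N,) initial probabilities; A: (N, N) transition matrix;
    B: (N, M) discrete emission probabilities; obs: list of symbol indices.
    """
    alpha = pi * B[:, obs[0]]
    log_lik = np.log(alpha.sum())
    alpha = alpha / alpha.sum()          # rescale to avoid underflow
    for t in range(1, len(obs)):
        alpha = (alpha @ A) * B[:, obs[t]]
        log_lik += np.log(alpha.sum())
        alpha = alpha / alpha.sum()
    return log_lik

def recognize(obs_syl, obs_tone, syl_models, tone_models, log_coupling):
    """Score every (syllable, tone) pair and return the best one.

    Each model is a (pi, A, B) triple. log_coupling[i, j] is the log of a
    hypothetical intercoupling probability tying syllable model i to tone
    model j (the paper's exact coupling form is not given in the abstract).
    """
    best, best_score = None, -np.inf
    for i, sm in enumerate(syl_models):
        ls = forward_log_likelihood(obs_syl, *sm)
        for j, tm in enumerate(tone_models):
            score = ls + forward_log_likelihood(obs_tone, *tm) + log_coupling[i, j]
            if score > best_score:
                best, best_score = (i, j), score
    return best, best_score

# Tiny demo: model 0 favors symbol 0, model 1 favors symbol 1.
uniform = np.array([0.5, 0.5])
A = np.array([[0.5, 0.5], [0.5, 0.5]])
B0 = np.array([[0.9, 0.1], [0.9, 0.1]])   # emits symbol 0 most of the time
B1 = np.array([[0.1, 0.9], [0.1, 0.9]])   # emits symbol 1 most of the time
models = [(uniform, A, B0), (uniform, A, B1)]
log_c = np.log(np.full((2, 2), 0.25))     # uniform coupling for the demo

best_pair, _ = recognize([0, 0, 0], [1, 1, 1], models, models, log_c)
print(best_pair)   # the syllable-0 / tone-1 pair wins: (0, 1)
```

With a uniform coupling matrix the decision reduces to two independent maximum-likelihood choices; a trained, non-uniform coupling matrix is what lets the tone hypothesis reweight the syllable hypothesis, which is the point of the two-level structure.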
