IEEE Transactions on Signal Processing

Isolated-utterance speech recognition using hidden Markov models with bounded state durations

Abstract

Hidden Markov models with bounded state durations (HMM/BSD) are proposed to model HMM state durations explicitly and to capture the temporal structure of speech signals more accurately in a simple, direct, yet effective way. A series of speaker-dependent experiments was conducted using 408 highly confusable first-tone Mandarin syllables as the example vocabulary. In the discrete case, the recognition rate of HMM/BSD (78.5%) is 9.0%, 6.3%, and 1.9% higher than that of conventional HMMs, HMMs with Poisson-distributed state durations, and HMMs with gamma-distributed state durations, respectively. In the continuous case (partitioned Gaussian mixture modeling), the recognition rates of HMM/BSD (88.3% with 1 mixture, 88.8% with 3 mixtures, and 89.4% with 5 mixtures) are 6.3%, 5.0%, and 5.5% higher than those of conventional HMMs. They are also 5.9% (1 mixture) and 3.9% (3 mixtures) higher than those of HMMs with Poisson-distributed state durations, and 3.1% (1 mixture) and 1.8% (3 mixtures) higher than those of HMMs with gamma-distributed state durations.
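To make the idea of a bounded state duration concrete, the following is a minimal sketch of duration-constrained decoding for a left-to-right model. It is not the authors' algorithm: the function name `bounded_duration_score`, the inputs `log_b`, `d_min`, and `d_max`, and the choice to ignore transition probabilities and treat every duration inside the bounds as equally likely are all assumptions made for illustration.

```python
NEG_INF = float("-inf")

def bounded_duration_score(log_b, d_min, d_max):
    """Best log score of a left-to-right pass through every state, where
    state i must be occupied for d_min[i]..d_max[i] consecutive frames.

    log_b[i][t] is a hypothetical log emission score of frame t in state i.
    Returns -inf when no segmentation of the frames satisfies the bounds.
    """
    n_states = len(log_b)
    n_frames = len(log_b[0])

    # prefix[i][t] = sum of log_b[i][0..t-1], so a segment sum costs O(1).
    prefix = []
    for row in log_b:
        acc = [0.0]
        for value in row:
            acc.append(acc[-1] + value)
        prefix.append(acc)

    # best[t] = best score after the states processed so far have
    # consumed exactly the first t frames.
    best = [NEG_INF] * (n_frames + 1)
    best[0] = 0.0
    for i in range(n_states):
        nxt = [NEG_INF] * (n_frames + 1)
        for end in range(1, n_frames + 1):
            # Only durations inside the bounds for state i are allowed.
            for d in range(d_min[i], min(d_max[i], end) + 1):
                start = end - d
                if best[start] == NEG_INF:
                    continue
                cand = best[start] + prefix[i][end] - prefix[i][start]
                if cand > nxt[end]:
                    nxt[end] = cand
        best = nxt
    return best[n_frames]


# Toy example: 2 states, 4 frames; state 0 fits frames 0-1, state 1 fits 2-3.
log_b = [[0.0, 0.0, -5.0, -5.0],
         [-5.0, -5.0, 0.0, 0.0]]
loose = bounded_duration_score(log_b, [1, 1], [4, 4])
tight = bounded_duration_score(log_b, [1, 1], [1, 4])
impossible = bounded_duration_score(log_b, [3, 3], [4, 4])
```

In an isolated-utterance setting, one such score could be computed per vocabulary model and the highest-scoring model chosen. A path that would hold a state for too few or too many frames is simply excluded, which is how the bounds constrain temporal structure in this sketch.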