Conference: Neural and Stochastic Methods in Image and Signal Processing

Combining LVQ with continuous-density hidden Markov models in speech recognition


Abstract

We propose using self-organizing maps (SOMs) and learning vector quantization (LVQ) to initialize the training of continuous observation density hidden Markov models (CDHMMs). We apply CDHMMs to model phonemes in the transcription of speech into phoneme sequences. The Baum-Welch maximum likelihood estimation method is very sensitive to the initial parameter values when the observation densities are represented by mixtures of many Gaussian density functions. We therefore suggest training CDHMMs in two phases. First, vector quantization methods are applied to find suitable placements for the means of the Gaussian density functions that represent the observed training data. Maximum likelihood estimation is then used to find the mixture weights and state transition probabilities and to re-estimate the Gaussians to obtain the best possible models. Initializing the means of the distributions by SOMs or LVQ yields good recognition results with substantially fewer Baum-Welch iterations than are needed with random initial values. In the segmental K-means algorithm, too, a suitable initialization markedly reduces the number of iterations. Furthermore, we experiment with enhancing the discriminatory power of the phoneme models by adaptively training the state output distributions using the LVQ algorithm.
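The two-phase idea can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several simplifying assumptions: a single HMM state is modelled, so Baum-Welch reduces to plain EM for a Gaussian mixture and state transition probabilities are left out. Covariances are assumed diagonal. Phase one uses online winner-take-all vector quantization, which is roughly a SOM without a neighbourhood function, in place of a full SOM or LVQ. The names `vq_init_means` and `em_diag_gmm` and all parameter values are hypothetical.

```python
import numpy as np


def vq_init_means(data, n_means, n_epochs=10, lr0=0.3, seed=0):
    """Phase 1: place codebook vectors by online winner-take-all VQ."""
    rng = np.random.default_rng(seed)
    # Start from randomly chosen training vectors.
    means = data[rng.choice(len(data), size=n_means, replace=False)].copy()
    n_steps = n_epochs * len(data)
    step = 0
    for _ in range(n_epochs):
        for idx in rng.permutation(len(data)):
            x = data[idx]
            lr = lr0 * (1.0 - step / n_steps)  # linearly decaying learning rate
            winner = np.argmin(np.sum((means - x) ** 2, axis=1))
            means[winner] += lr * (x - means[winner])  # move winner toward x
            step += 1
    return means


def em_diag_gmm(data, init_means, n_iters=20, var_floor=1e-3):
    """Phase 2: EM re-estimation of a diagonal-covariance Gaussian mixture."""
    means = init_means.copy()
    k = len(means)
    # Assumed starting point: global variance and uniform mixture weights.
    variances = np.tile(data.var(axis=0) + var_floor, (k, 1))
    weights = np.full(k, 1.0 / k)
    log_liks = []
    for _ in range(n_iters):
        # E-step: log(weight * density) for every sample and component.
        diff = data[:, None, :] - means[None, :, :]
        log_p = (np.log(weights)
                 - 0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)
                 - 0.5 * np.sum(diff ** 2 / variances, axis=2))
        log_norm = np.logaddexp.reduce(log_p, axis=1)
        log_liks.append(float(log_norm.sum()))
        resp = np.exp(log_p - log_norm[:, None])  # responsibilities
        # M-step: re-estimate weights, means and variances.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / nk.sum()
        means = (resp.T @ data) / nk[:, None]
        diff = data[:, None, :] - means[None, :, :]
        variances = (np.einsum("nk,nkd->kd", resp, diff ** 2) / nk[:, None]
                     + var_floor)
    return weights, means, variances, log_liks
```

Calling `em_diag_gmm(data, vq_init_means(data, n_means))` on an array of feature vectors runs both phases. The returned `log_liks` list gives the log-likelihood per iteration, which is one way to compare how quickly EM settles from a VQ start versus a random one.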
