Fast speaker adaptation combined with soft vector quantization in an HMM speech recognition system

Abstract

The authors describe a method for combining speaker adaptation by feature vector transformation with semi-continuous hidden Markov modeling (SCHMM). Because the SCHMM system represents the reference speaker's voice by multidimensional Gaussian distributions, it is these distributions, rather than the feature vectors, that must be transformed. The performance of hard-decision vector quantization (HVQ), soft-decision VQ (SVQ), and SCHMM is compared, as is that of the speaker-adaptive and speaker-independent systems. In addition, the influence of dynamic features is investigated. The definition of subword units is optimized, and the SCHMM system is optimized with respect to covariance type (full or diagonal) and codebook size. Model initialization and distribution reestimation during training are introduced. Compared with previously reported HVQ-based systems, the mean recognition rate under difficult conditions improves significantly: from 71.6% to 84.6% (speaker-independent) and from 80.4% to 87.4% (speaker-adaptive).
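The abstract does not give the form of the transformation or of the quantizers, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the speaker transformation is affine, x -> A x + b, in which case a Gaussian N(mean, cov) maps to N(A mean + b, A cov A^T). It also contrasts a hard decision (nearest codeword) with a soft decision (normalized densities over all codewords) and shows a semi-continuous emission as a state-specific weighted sum over a shared codebook. All function names, the diagonal-covariance codebook, and the numeric values are hypothetical.

```python
import math

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def transform_gaussian(mean, cov, A, b):
    """Map N(mean, cov) through the assumed affine map x -> A x + b."""
    new_mean = [m + bi for m, bi in zip(matvec(A, mean), b)]
    new_cov = matmul(matmul(A, cov), transpose(A))
    return new_mean, new_cov

def diag_gaussian_pdf(x, mean, var):
    """Density of a Gaussian with diagonal covariance (per-dimension variances)."""
    log_p = 0.0
    for xi, mi, vi in zip(x, mean, var):
        log_p += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return math.exp(log_p)

def hard_vq(x, means):
    """Hard decision: index of the nearest codeword (squared Euclidean distance)."""
    dists = [sum((xi - mi) ** 2 for xi, mi in zip(x, m)) for m in means]
    return dists.index(min(dists))

def soft_vq(x, means, variances):
    """Soft decision: normalized Gaussian densities over all codewords."""
    dens = [diag_gaussian_pdf(x, m, v) for m, v in zip(means, variances)]
    total = sum(dens)
    return [d / total for d in dens]

def schmm_emission(x, means, variances, mix_weights):
    """Semi-continuous emission: state-specific weights over a shared codebook."""
    return sum(c * diag_gaussian_pdf(x, m, v)
               for c, m, v in zip(mix_weights, means, variances))

# Hypothetical two-dimensional example.
A = [[2.0, 0.0], [0.0, 0.5]]
b = [1.0, -1.0]
adapted_mean, adapted_cov = transform_gaussian(
    [1.0, 2.0], [[1.0, 0.0], [0.0, 4.0]], A, b)

codebook_means = [[0.0, 0.0], [2.0, 0.0]]
codebook_vars = [[1.0, 1.0], [1.0, 1.0]]
x = [0.5, 0.0]
print(adapted_mean, adapted_cov)
print(hard_vq(x, codebook_means), soft_vq(x, codebook_means, codebook_vars))
```

The hard decision discards everything except the winning index, whereas the soft weights retain how close the vector lies to each codeword. That retained information is what a semi-continuous emission combines with its per-state mixture weights.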
