
Speaker normalization and adaptation based on linear transformation


Abstract

We propose a novel approach to speaker-independent (SI) modeling and speaker adaptation based on a linear transformation. An SI model and speaker-dependent (SD) models are usually generated using the same preprocessing of acoustic data. This straightforward preprocessing causes serious problems: the probability distributions of the SI model become broad, and the SI model does not give good initial estimates for speaker adaptation. To solve these problems, a normalized SI model is generated by removing speaker characteristics using a shift vector obtained by the maximum likelihood linear regression (MLLR) technique. In addition, we propose a speaker adaptation method that combines the MLLR and maximum a posteriori (MAP) techniques, starting from the normalized SI model. Experiments were performed on a Japanese phoneme recognition task using continuous-density Gaussian mixture HMMs. In the baseline recognition test, the normalized SI model achieved a 12.8% reduction in phoneme recognition error rate compared with the conventional SI model. Furthermore, the proposed adaptation method using the normalized SI model was more effective than the tested conventional method regardless of the amount of adaptation data.
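The abstract does not give the estimation formulas, so the following is only a minimal sketch of the two standard building blocks it names, not the authors' implementation. It assumes diagonal-covariance Gaussians and occupation probabilities that have already been computed (for example by forward-backward alignment). The function names `mllr_shift` and `map_means`, the prior weight `tau`, and the toy data are all hypothetical. The shift is the closed-form maximum-likelihood bias for a model of the form mean + shift, and the MAP step is the usual interpolation between a prior mean and the data mean. Applying the shift first and MAP second is one plausible way to combine them, assumed here for illustration.

```python
import numpy as np


def mllr_shift(obs, gamma, means, variances):
    """Maximum-likelihood shift vector b for the shifted model mean_m + b.

    obs:       (T, D) feature frames from one speaker
    gamma:     (T, M) occupation probability of each Gaussian at each frame
    means:     (M, D) Gaussian means of the SI model
    variances: (M, D) diagonal covariances of the SI model
    """
    occ = gamma.sum(axis=0)                 # (M,) total occupancy per Gaussian
    first = gamma.T @ obs                   # (M, D) occupancy-weighted frame sums
    inv_var = 1.0 / variances
    # Precision-weighted residuals, pooled over all Gaussians (one global shift).
    num = (inv_var * (first - occ[:, None] * means)).sum(axis=0)
    den = (inv_var * occ[:, None]).sum(axis=0)
    return num / den


def map_means(obs, gamma, prior_means, tau=10.0):
    """MAP re-estimate of Gaussian means; tau weights the prior (assumed value)."""
    occ = gamma.sum(axis=0)
    first = gamma.T @ obs
    return (tau * prior_means + first) / (tau + occ[:, None])


# Toy data (hypothetical): two 2-D Gaussians and one speaker whose frames
# are offset from the model means by a constant vector.
means = np.array([[0.0, 0.0], [5.0, 5.0]])
variances = np.ones_like(means)
labels = np.array([0, 0, 1, 1, 1])
true_shift = np.array([1.0, -2.0])
obs = means[labels] + true_shift
gamma = np.eye(2)[labels]                   # hard (one-hot) occupation probabilities

shift = mllr_shift(obs, gamma, means, variances)
normalized_obs = obs - shift                # frames with the speaker offset removed
adapted = map_means(obs, gamma, means + shift, tau=10.0)
```

In this sketch, subtracting each training speaker's shift before pooling the data is what would keep the SI distributions from broadening, while at adaptation time the shifted means serve as the MAP prior. With little adaptation data the result stays close to the shifted model, and with more data it moves toward the speaker's own statistics.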
