Journal: Circuits, Systems, and Signal Processing

Rapid Speaker Adaptation Based on Combination of KPCA and Latent Variable Model



Abstract

Speaker adaptation shifts a speaker-independent model closer to the speech characteristics of a new speaker in order to improve speech recognition performance. Kernel eigenspace-based speaker adaptation methods provide satisfactory performance using only a small amount of adaptation data. In such methods, kernel principal component analysis (KPCA) is applied to the training speaker space to create a kernel eigenspace, and the acoustic model adapted to the new speaker is then calculated in that space. One limitation of KPCA is that it cannot define a precise pre-image, back in the speaker space, of the model adapted in the kernel eigenspace. As a result, a large amount of computation is required to perform adaptation. Previously developed solutions for calculating an approximate pre-image of the adapted model do not necessarily lead to optimal conditions. In this paper, we therefore propose an efficient solution to this problem that constructs a more reliable pre-image of the adapted model in the speaker space. For this purpose, we use a latent variable model to define a probabilistic description of the mapping between the kernel eigenspace and the speaker space. Experiments were conducted on two speech databases: FARSDAT, a Persian database, and TIMIT, an English database. Using a typical HMM-based automatic speech recognition system, we verified that the proposed method, with about three seconds of adaptation data, achieves relative improvements in phoneme recognition accuracy of up to 4.4% and 7.6% over the speaker-independent model on FARSDAT and TIMIT, respectively. Moreover, the proposed approach outperformed the other kernel eigenspace-based adaptation methods.
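To make the pre-image problem mentioned in the abstract concrete, the following is a minimal sketch of KPCA with a Gaussian kernel followed by a generic fixed-point pre-image approximation. It is not the latent variable method proposed in the paper. The random vectors stand in for speaker supervectors, and all function names, the kernel width `gamma`, and the number of components are assumptions made for illustration only.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    # Pairwise Gaussian kernel k(a, b) = exp(-gamma * ||a - b||^2).
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_kpca(X, gamma, n_components):
    # Build and centre the kernel matrix, then keep the leading eigenvectors.
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one
    evals, evecs = np.linalg.eigh(Kc)
    idx = np.argsort(evals)[::-1][:n_components]
    # Scale so the corresponding feature-space eigenvectors have unit norm.
    alphas = evecs[:, idx] / np.sqrt(evals[idx])
    return K, alphas

def project(X, K, alphas, x, gamma):
    # Coordinates of a new point in the kernel eigenspace.
    k = rbf_kernel(x[None, :], X, gamma)[0]
    k_centred = k - K.mean(axis=0) - k.mean() + K.mean()
    return alphas.T @ k_centred

def preimage_fixed_point(X, alphas, beta, gamma, z0, n_iter=100):
    # Approximate pre-image of the eigenspace point `beta` by fixed-point
    # iteration (valid for Gaussian kernels; may converge to a local optimum).
    w = alphas @ beta
    g = w + (1.0 - w.sum()) / X.shape[0]  # accounts for kernel centring
    z = z0.copy()
    for _ in range(n_iter):
        wk = g * rbf_kernel(z[None, :], X, gamma)[0]
        denom = wk.sum()
        if abs(denom) < 1e-12:
            break  # iteration has become numerically unstable
        z = (wk @ X) / denom
    return z

# Toy stand-in for a training speaker space: 20 "speakers", 4 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))
x_new = X.mean(axis=0) + 0.1 * rng.normal(size=4)

K, alphas = fit_kpca(X, gamma=0.1, n_components=5)
beta = project(X, K, alphas, x_new, gamma=0.1)
z = preimage_fixed_point(X, alphas, beta, gamma=0.1, z0=x_new)
```

Here `beta` lives in the kernel eigenspace and `z` is only an approximation of its pre-image in the original space. The iteration depends on its starting point and offers no optimality guarantee, which is the kind of limitation the abstract attributes to earlier approximate pre-image solutions.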
