Source: 電子情報通信学会技術研究報告. 音声. Speech

Vocal tract length normalization using linear transformation based on maximum likelihood estimation



Abstract

Vocal tract length normalization (VTLN) is a popular speaker adaptation technique for speech recognition. This study proposes a new VTLN algorithm in which the HMM parameters are adapted to vocal tract length in the mel-cepstral domain through a linear transformation model, with the adaptation carried out by an expectation-maximization (EM) procedure. Existing VTLN approaches based on the bilinear transformation employ a specific non-linear frequency warping function in the spectral domain and adapt the HMM parameters in the cepstral domain. In contrast, the proposed approach assumes a linear frequency warping with a single scaling factor and models the equivalent operation in the mel-cepstral domain using a first-order Taylor series approximation. The proposed scheme yields a significant improvement in recognition performance on a speaker-independent word recognition task.
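
The idea of replacing a frequency warping by a linear transformation of cepstral coefficients, linearized with a first-order Taylor expansion, can be illustrated with a small numerical sketch. This is not the paper's algorithm: it assumes a plain (not mel) cepstrum with the log spectrum written as S(w) = sum_n c[n] cos(n w), truncated to a fixed order, and it estimates the warping offset for a single diagonal Gaussian in closed form instead of running EM over an HMM. The function names `warp_matrix`, `warp_derivative`, and `estimate_offset` are hypothetical and chosen for illustration only.

```python
import numpy as np


def _grid(num_points):
    """Midpoint grid on [0, pi] used for simple numerical integration."""
    return (np.arange(num_points) + 0.5) * np.pi / num_points


def _weights(order):
    """Projection weights onto the cosine basis: 1 for m = 0, 2 for m >= 1."""
    k = np.full(order, 2.0)
    k[0] = 1.0
    return k


def warp_matrix(order, alpha, num_points=4096):
    """Matrix A(alpha) such that A @ c is the (truncated) cepstrum of S(alpha * w).

    Assumption: S(w) = sum_n c[n] * cos(n * w), plain cepstrum, no mel scale.
    """
    w = _grid(num_points)
    n = np.arange(order)
    target = np.cos(np.outer(w, n))          # cos(m w), one column per m
    warped = np.cos(np.outer(alpha * w, n))  # cos(n alpha w), one column per n
    return (_weights(order)[:, None] / num_points) * (target.T @ warped)


def warp_derivative(order, num_points=4096):
    """B = dA/dalpha at alpha = 1, so that A(alpha) ~ I + (alpha - 1) * B."""
    w = _grid(num_points)
    n = np.arange(order)
    target = np.cos(np.outer(w, n))
    phase = np.outer(w, n)
    d_warped = -phase * np.sin(phase)        # d/dalpha cos(n alpha w) at alpha = 1
    return (_weights(order)[:, None] / num_points) * (target.T @ d_warped)


def estimate_offset(frames, mean, var, deriv):
    """Closed-form ML estimate of delta = alpha - 1 for ONE diagonal Gaussian.

    Simplifying assumption: the adapted mean is mean + delta * (deriv @ mean)
    and the variance is left unchanged.
    """
    direction = deriv @ mean
    resid = frames - mean
    num = np.sum(resid @ (direction / var))
    den = frames.shape[0] * np.sum(direction ** 2 / var)
    return num / den


if __name__ == "__main__":
    order = 13
    B = warp_derivative(order)
    for delta in (0.001, 0.01, 0.05):
        exact = warp_matrix(order, 1.0 + delta)
        approx = np.eye(order) + delta * B
        print(delta, np.max(np.abs(exact - approx)))
```

Under these assumptions the approximation error grows roughly with the square of the offset from alpha = 1 and is largest for high-order coefficients, so a first-order model of this kind is best suited to warping factors close to 1. Because the adapted mean is linear in the offset, the likelihood of a Gaussian is quadratic in it, which is what makes a closed-form update possible in this simplified setting.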
