...
Source: 電子情報通信学会技術研究報告. 音声. Speech
Vocal tract length normalization using linear transformation based on maximum likelihood estimation

Abstract

Vocal tract length normalization (VTLN) is a widely used speaker adaptation technique for speech recognition. This study proposes a new VTLN algorithm in which HMM parameters are adapted to vocal tract length in the mel-cepstral domain through a linear transformation model, with the adaptation carried out by expectation-maximization (EM) based estimation. Existing VTLN approaches based on the bilinear transformation employ a specific non-linear frequency warping function in the spectral domain and adapt the HMM parameters in the cepstral domain. The proposed approach instead assumes linear frequency warping with a single scaling factor and models the equivalent operation in the mel-cepstral domain using a first-order Taylor series approximation. The proposed scheme significantly improves recognition performance in a speaker-independent word recognition task.
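The idea of replacing a frequency scaling by a linear map on cepstral coefficients can be illustrated with a small numerical sketch. It is not the authors' formulation: it assumes a plain (not mel) cepstrum, a log-spectrum written as S(w) = c_0 + 2 * sum_k c_k cos(k w), and warping S(w) -> S(alpha * w). The function names `exact_warp_matrix` and `taylor_warp_matrix`, the grid size, and the test cepstrum are hypothetical choices made for illustration; the EM estimation of the scaling factor and the HMM adaptation are not shown.

```python
import numpy as np

def _grid(n_grid):
    """Frequency grid on [0, pi] with trapezoid-rule weights."""
    w = np.linspace(0.0, np.pi, n_grid)
    q = np.full(n_grid, w[1] - w[0])
    q[[0, -1]] *= 0.5
    return w, q

def exact_warp_matrix(alpha, n_cep=8, n_grid=4096):
    """Matrix A(alpha) with c_warped = A @ c for the warping S(w) -> S(alpha * w).

    Entry [n, k] is (1/pi) * integral of cos(n w) * b_k(alpha w) over [0, pi],
    where b_0 = 1 and b_k = 2 cos(k .) for k >= 1. For alpha > 1 the cosine
    series is simply evaluated beyond pi (its even periodic extension).
    """
    w, q = _grid(n_grid)
    k = np.arange(n_cep)
    out_basis = np.cos(np.outer(k, w))               # cos(n w)
    in_basis = 2.0 * np.cos(np.outer(k, alpha * w))  # 2 cos(k alpha w)
    in_basis[0] = 1.0                                # c_0 term has weight 1
    return (out_basis * q) @ in_basis.T / np.pi

def taylor_warp_matrix(alpha, n_cep=8, n_grid=4096):
    """First-order Taylor approximation of A(alpha) around alpha = 1.

    A(alpha) ~= I + (alpha - 1) * D, where D is dA/dalpha at alpha = 1.
    """
    w, q = _grid(n_grid)
    k = np.arange(n_cep)
    out_basis = np.cos(np.outer(k, w))
    # d/dalpha of 2 cos(k alpha w) at alpha = 1 is -2 k w sin(k w); row k = 0 is zero.
    d_in_basis = -2.0 * k[:, None] * w[None, :] * np.sin(np.outer(k, w))
    D = (out_basis * q) @ d_in_basis.T / np.pi
    return np.eye(n_cep) + (alpha - 1.0) * D

if __name__ == "__main__":
    c = 1.0 / (1.0 + np.arange(8)) ** 2   # hypothetical, smoothly decaying cepstrum
    alpha = 1.01                          # scaling factor close to 1
    exact = exact_warp_matrix(alpha) @ c
    approx = taylor_warp_matrix(alpha) @ c
    print("change due to warping:", np.linalg.norm(exact - c))
    print("first-order error:    ", np.linalg.norm(exact - approx))
```

In this sketch the linearized matrix depends on the scaling factor only through the scalar (alpha - 1), which is what makes a single-parameter estimate convenient; the approximation is only expected to hold for alpha close to 1 and for cepstra whose higher-order coefficients are small.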
