Vocal tract length normalization using linear transformation based on maximum likelihood estimation

Jun Rokui; Nakai Mitsuru; Hiroshi Shimodaira; Shigeki Sagayama

Abstract

Vocal tract length normalization (VTLN) is a popular speaker adaptation technique for speech recognition. The present study proposes a new VTLN algorithm in which HMM parameters are adapted to vocal tract length in the mel-cepstral domain through a linear transformation model, with the adaptation estimated by expectation-maximization (EM). Existing bilinear-transformation approaches to VTLN apply a specific non-linear frequency warping function in the spectral domain and carry out HMM parameter adaptation in the cepstral domain. The proposed approach instead assumes a linear frequency warping with a single scaling factor, and models the equivalent operation in the mel-cepstral domain using a first-order Taylor series approximation. The proposed scheme demonstrates a significant improvement in recognition performance on a speaker-independent word recognition task.
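
The abstract gives neither the transformation matrix nor the update equations, so the following is only a rough sketch of the general idea, not the authors' algorithm. It assumes that a warping factor alpha near 1 moves each Gaussian mean as mu(alpha) ≈ mu + (alpha - 1) B mu, where B is a hypothetical derivative matrix standing in for the first-order Taylor term. It also assumes a diagonal-covariance Gaussian mixture in place of the HMM, and adapts only the means. All function and variable names are illustrative.

```python
import numpy as np


def adapted_means(means, B, alpha):
    """First-order (Taylor) mean adaptation: mu + (alpha - 1) * B @ mu.

    means: (M, D) Gaussian means; B: (D, D) assumed derivative matrix.
    """
    return means + (alpha - 1.0) * means @ B.T


def log_gauss_diag(x, means, variances):
    """Log density of every frame under every diagonal Gaussian, shape (T, M)."""
    diff = x[:, None, :] - means[None, :, :]
    return -0.5 * np.sum(
        diff ** 2 / variances + np.log(2.0 * np.pi * variances), axis=2
    )


def estimate_alpha(x, weights, means, variances, B, n_iter=10):
    """EM estimate of a single scaling factor alpha under the linearized model.

    x: (T, D) frames; weights: (M,); means, variances: (M, D).
    """
    delta = 0.0                      # delta = alpha - 1, start from "no warping"
    direction = means @ B.T          # (M, D): B @ mu_m for each component
    for _ in range(n_iter):
        # E-step: component posteriors under the currently adapted means.
        logp = np.log(weights) + log_gauss_diag(
            x, means + delta * direction, variances
        )
        logp -= logp.max(axis=1, keepdims=True)   # numerical stability
        gamma = np.exp(logp)
        gamma /= gamma.sum(axis=1, keepdims=True)

        # M-step: closed-form weighted least squares for the scalar delta.
        resid = x[:, None, :] - means[None, :, :]             # (T, M, D)
        g = gamma[:, :, None]
        num = np.sum(g * direction[None] * resid / variances[None])
        den = np.sum(g * direction[None] ** 2 / variances[None])
        delta = num / den
    return 1.0 + delta
```

Because only the means move and the covariances stay fixed, the M-step reduces to a one-dimensional weighted least-squares problem, which is why alpha has a closed-form update in this simplified setting. A real system would need the actual matrix B derived for mel-cepstra and HMM state posteriors from a forward-backward pass.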

Bibliographic details

Source: 電子情報通信学会技術研究報告. 音声. Speech, 2001, No. 522, 6 pages
Authors: Jun Rokui; Nakai Mitsuru; Hiroshi Shimodaira; Shigeki Sagayama
Original format: PDF
Text language: Japanese (jpn)
Chinese Library Classification: Telegraphy, facsimile
Keywords: Vocal tract length normalization; Linear transformation; Maximum likelihood estimation; Speaker adaptation; Speaker normalization
