Frequency Warping for Speaker Adaptation in HMM-based Speech Synthesis

Weixun Gao; Qiying Cao


Abstract

Speaker adaptation in speech synthesis transforms a source utterance into a target utterance that differs from the source in voice characteristics. In this paper, we apply vocal tract length normalization (VTLN), which is generally used in speech recognition to remove individual speaker characteristics, to speaker adaptation in speech synthesis. We propose a frequency warping approach based on a time-varying bilinear function to reduce the weighted spectral distance between the source speaker and the target speaker. The warped spectra of the source speaker are then converted to line spectrum pairs to train hidden Markov models (HMMs). The HMMs are further adapted with the target speaker's data by algorithms based on maximum likelihood linear regression (MLLR). The experimental results show that our frequency warping approach brings the warped spectra of the source speaker closer to those of the target speaker, and that the resulting adapted HMMs outperform HMMs trained on unwarped spectra in terms of the naturalness and speaker similarity of the synthesized speech.
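The abstract does not give the warping formula or the distance weighting, so the following is only a minimal sketch of the general idea, assuming the standard first-order all-pass (bilinear) warping function commonly used for VTLN. The function names (`bilinear_warp`, `warp_spectrum`, `estimate_alpha`, `estimate_alpha_track`), the search grid, the uniform default weights, and the squared log-spectral distance are illustrative assumptions, not the authors' method. Frames are assumed to be already time-aligned between the two speakers.

```python
import numpy as np


def bilinear_warp(omega, alpha):
    """First-order all-pass (bilinear) warping of frequencies in [0, pi].

    For |alpha| < 1 the mapping is monotonic and keeps 0 and pi fixed.
    """
    return omega + 2.0 * np.arctan2(alpha * np.sin(omega),
                                    1.0 - alpha * np.cos(omega))


def warp_spectrum(spectrum, alpha):
    """Warp a magnitude spectrum sampled uniformly on [0, pi].

    A component at source frequency w moves to bilinear_warp(w, alpha).
    The inverse of the bilinear warp with alpha is the warp with -alpha,
    so each output bin reads the source at the inversely warped frequency.
    """
    omega = np.linspace(0.0, np.pi, len(spectrum))
    source_freqs = bilinear_warp(omega, -alpha)
    return np.interp(source_freqs, omega, spectrum)


def estimate_alpha(source_frame, target_frame, weights=None, grid=None):
    """Grid-search the warping factor for one frame.

    Minimizes a weighted squared log-spectral distance (an assumed
    stand-in for the paper's weighted spectral distance).
    """
    if weights is None:
        weights = np.ones(len(source_frame))  # assumption: uniform weights
    if grid is None:
        grid = np.linspace(-0.2, 0.2, 41)     # assumption: search range
    log_target = np.log(target_frame)

    def distance(alpha):
        diff = np.log(warp_spectrum(source_frame, alpha)) - log_target
        return np.sum(weights * diff ** 2)

    return min(grid, key=distance)


def estimate_alpha_track(source_frames, target_frames):
    """Time-varying warping: one factor per pair of aligned frames."""
    return [estimate_alpha(s, t) for s, t in zip(source_frames, target_frames)]
```

A per-frame factor is the simplest reading of "time-varying"; a practical system would probably smooth the resulting track over time, which this sketch leaves out.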


Bibliographic record

Source: Journal of Information Science and Engineering, 2014, No. 4, pp. 1149-1166 (18 pages)
Authors: Weixun Gao; Qiying Cao
Author affiliations:

School of Information Science and Technology, Donghua University, Shanghai 200051, P.R. China

School of Information Science and Technology, Donghua University, Shanghai 200051, P.R. China; College of Computer Science and Technology, Donghua University, Shanghai 200051, P.R. China

Original format: PDF
Language: English
Keywords: frequency warping; VTLN; speaker adaptation; HMM-based speech synthesis; MLLR

Similar literature

1. Personalising speech-to-speech translation: Unsupervised cross-lingual speaker adaptation for HMM-based speech synthesis [J]. John Dines, Hui Liang, Lakshmi Saheer. Computer Speech and Language, 2013, No. 2.
2. Analysis of Speaker Adaptation Algorithms for HMM-Based Speech Synthesis and a Constrained SMAPLR Adaptation Algorithm [J]. Yamagishi J., Kobayashi T., Nakano Y. IEEE Transactions on Audio, Speech and Language Processing, 2009, No. 1.
3. HMM-based Style Control for Expressive Speech Synthesis with Arbitrary Speaker's Voice Using Model Adaptation [J]. Takashi Nose, Makoto Tachibana, Takao Kobayashi. IEICE Transactions on Information and Systems, 2009, No. 3.
4. Cross-lingual Speaker Adaptation for HMM-based Speech Synthesis based on Perceptual Characteristics and Speaker Interpolation [C]. Viviane de Franca Oliveira, Sayaka Shiota, Yoshihiko Nankaku. Annual Conference of the International Speech Communication Association, 2012.
5. Frequency warping by linear transformation, and vocal tract inversion for speaker normalization in automatic speech recognition [D]. Panchapagesan, Sankaran. 2008.
6. One-against-All Weighted Dynamic Time Warping for Language-Independent and Speaker-Dependent Speech Recognition in Adverse Conditions [O]. Xianglilan Zhang, Jiping Sun, Zhigang Luo. 2010.
7. Speaker similarity evaluation of foreign-accented speech synthesis using HMM-based speaker adaptation [O]. Wester, M., Karhila, R. 2011.
