Journal: Journal of VLSI signal processing systems for signal, image, and video technology

Pitch-Scaled Spectrum Based Excitation Model for HMM-based Speech Synthesis

Abstract

The speech generated by hidden Markov model (HMM)-based speech synthesis systems (HTS) suffers from a 'buzzing' sound caused by an over-simplified vocoding technique. This paper proposes a new excitation model that uses a pitch-scaled spectrum for the parametric representation of speech in HTS. A residual signal produced by inverse filtering retains the detailed harmonic structure of speech that is not captured by the linear prediction (LP) spectrum. By using pitch-scaled spectra, we can compensate the LP spectrum with the detailed harmonic structure of the residual signal. This spectrum can be compressed using a periodic excitation parameter so that it can be used to train HTS. We define an aperiodic measure as the harmonics-to-noise ratio, and calculate a voicing cut-off frequency to fit the aperiodic measure to a sigmoid function. We combine the LP coefficients, the pitch-scaled spectrum, and the sigmoid function to create a new parametric representation of speech. Listening tests were carried out to evaluate the effectiveness of the proposed technique. The vocoder received a mean opinion score of 4.0 in analysis-synthesis experiments, before dimensionality reduction. By integrating this vocoder into HTS, we improved the quality of the synthesized speech compared with the pulse train excitation model, and obtained an even better result than STRAIGHT-HTS.
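The inverse-filtering step and the sigmoid aperiodicity model mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the standard autocorrelation method for LP analysis, and the function names, the test signal, and the sigmoid parametrization (a cut-off frequency plus a slope) are hypothetical choices made only for illustration.

```python
import numpy as np

def lp_coefficients(frame, order):
    """Autocorrelation-method LP analysis (an assumed, textbook variant).

    Returns a = [1, a1, ..., ap], the coefficients of the inverse filter A(z).
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    # Toeplitz normal equations: sum_k a_k r(|i-k|) = -r(i), i = 1..order
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R += 1e-9 * r[0] * np.eye(order)  # small regularization for stability
    a = np.linalg.solve(R, -r[1:])
    return np.concatenate(([1.0], a))

def lp_residual(frame, a):
    """Inverse filtering: e[n] = sum_k a[k] * x[n-k], zero initial state."""
    frame = np.asarray(frame, dtype=float)
    return np.convolve(frame, a)[: len(frame)]

def sigmoid_aperiodicity(freqs_hz, cutoff_hz, slope):
    """Hypothetical sigmoid: close to 0 below the cut-off, close to 1 above it."""
    return 1.0 / (1.0 + np.exp(-slope * (np.asarray(freqs_hz) - cutoff_hz)))

# Synthetic test signal (assumption): AR(2) process driven by white noise,
# x[n] = 1.3 x[n-1] - 0.6 x[n-2] + w[n], so the true A(z) is [1, -1.3, 0.6].
rng = np.random.default_rng(0)
w = rng.standard_normal(4000)
x = np.zeros_like(w)
for i in range(len(w)):
    x[i] = w[i]
    if i >= 1:
        x[i] += 1.3 * x[i - 1]
    if i >= 2:
        x[i] -= 0.6 * x[i - 2]

a_est = lp_coefficients(x, order=2)   # estimated inverse-filter coefficients
e = lp_residual(x, a_est)             # residual carries what LP did not model
ap = sigmoid_aperiodicity([0.0, 4000.0, 8000.0], cutoff_hz=4000.0, slope=0.005)
```

On real speech the residual `e` would retain the harmonic (pitch) structure that the smooth LP envelope leaves out, which is the information the abstract says the pitch-scaled spectrum is meant to capture. The direction and slope of the sigmoid here are assumptions; the abstract only states that a voicing cut-off frequency is used to fit the aperiodic measure to a sigmoid.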
