IEEE International Conference on Acoustics, Speech, and Signal Processing

MULTI-DISTRIBUTION DEEP BELIEF NETWORK FOR SPEECH SYNTHESIS

Abstract

The deep belief network (DBN) has been shown to be a good generative model in tasks such as handwritten digit image generation. Previous work on DBNs in the speech community has mainly focused on using a generatively pre-trained DBN to initialize a discriminative model for better acoustic modeling in speech recognition (SR). To fully utilize its generative nature, we propose to model the speech parameters, including spectrum and F0, simultaneously and to generate these parameters from the DBN for speech synthesis. Compared with the predominant HMM-based approach, objective evaluation shows that the spectrum generated from the DBN has less distortion. Subjective results also confirm the advantage of the DBN-generated spectrum, and the overall quality is comparable to that of a context-independent HMM.
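
The abstract reports an objective comparison of spectral distortion but does not say which measure was used. As an illustration only, the sketch below computes log-spectral distortion (LSD), one common objective measure for comparing a generated spectrum against a reference. The choice of LSD, the function name `log_spectral_distortion`, and the `floor` parameter are assumptions made for this example and are not taken from the paper.

```python
import math


def log_spectral_distortion(reference, generated, floor=1e-10):
    """Mean log-spectral distortion (dB) between two sequences of power spectra.

    Hypothetical helper for illustration; the paper's actual metric is not
    stated in the abstract.

    reference, generated: lists of frames, where each frame is a list of
    non-negative power values. Both inputs must have the same shape.
    floor: small positive value that keeps the logarithm defined for zero bins.
    """
    if not reference or len(reference) != len(generated):
        raise ValueError("inputs must be non-empty with the same number of frames")

    per_frame = []
    for ref_frame, gen_frame in zip(reference, generated):
        if not ref_frame or len(ref_frame) != len(gen_frame):
            raise ValueError("frames must be non-empty with the same number of bins")
        # Squared difference of the two spectra in dB, bin by bin.
        squared_db = [
            (10.0 * math.log10(max(r, floor) / max(g, floor))) ** 2
            for r, g in zip(ref_frame, gen_frame)
        ]
        # Root-mean-square over frequency bins gives the distortion of one frame.
        per_frame.append(math.sqrt(sum(squared_db) / len(squared_db)))

    # Average the per-frame distortions over the whole utterance.
    return sum(per_frame) / len(per_frame)


if __name__ == "__main__":
    natural = [[1.0, 2.0, 4.0], [0.5, 1.0, 2.0]]
    synthesized = [[1.1, 1.8, 4.2], [0.6, 0.9, 2.1]]
    print(f"LSD: {log_spectral_distortion(natural, synthesized):.3f} dB")
```

Under this measure, a lower value means the generated spectrum is closer to the reference, which is the sense in which one system can be said to have "less distortion" than another.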
