Conference: 2010 International Conference on Computational Intelligence and Security

Automatic Speech Corpus Construction from Broadcasting Speech Databases

Abstract

Diversified speech synthesis requires speech corpora to be constructed frequently. This paper discusses our efforts to construct a speech corpus automatically from broadcasting speech databases for a trainable Text-To-Speech (TTS) system. We present a new framework for automatic speech corpus construction from broadcasting speech databases. First, a music detector based on speech/music discrimination selects the clean speech audio from the broadcast audio. Next, an automatic speech sentence segmentation system generates a sentence database from the clean speech audio. Finally, a text corpus construction method selects the spoken sentences that maximize coverage of the diphones in the sentence database. Experiments show that our method can generate a good speech corpus rapidly with minimal manual intervention.
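The abstract does not say how the final selection step maximizes diphone coverage. As an illustration only, the sketch below assumes a simple greedy strategy: repeatedly pick the sentence that adds the most diphones not yet covered. The function names, the sentence ids, and the representation of each sentence as a list of phoneme labels are all hypothetical and not taken from the paper.

```python
from typing import Dict, List, Set, Tuple

Diphone = Tuple[str, str]


def diphones(phonemes: List[str]) -> Set[Diphone]:
    """Return the set of adjacent phoneme pairs found in one sentence."""
    return set(zip(phonemes, phonemes[1:]))


def select_sentences(sentences: Dict[str, List[str]], max_sentences: int) -> List[str]:
    """Greedily choose sentence ids that add the most uncovered diphones.

    Assumption: greedy selection stands in for the paper's unspecified method.
    """
    remaining = {sid: diphones(ph) for sid, ph in sentences.items()}
    covered: Set[Diphone] = set()
    chosen: List[str] = []
    while remaining and len(chosen) < max_sentences:
        # Iterating over sorted ids makes tie-breaking deterministic.
        best = max(sorted(remaining), key=lambda sid: len(remaining[sid] - covered))
        gain = remaining[best] - covered
        if not gain:
            break  # no remaining sentence contributes a new diphone
        covered |= gain
        chosen.append(best)
        del remaining[best]
    return chosen


# Toy phoneme sequences, invented for this example.
toy = {
    "s1": ["a", "b", "c"],
    "s2": ["a", "b"],
    "s3": ["c", "d", "e", "f"],
}
selected = select_sentences(toy, max_sentences=3)
```

In this toy run, `s3` is chosen first because it contributes three new diphones, `s1` follows with two, and `s2` is skipped because its only diphone is already covered. A real system would also need a phonetic transcription of each segmented sentence, which this sketch takes as given.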