International Conference on Language Resources and Evaluation

Open-source Multi-speaker Speech Corpora for Building Gujarati, Kannada, Malayalam, Marathi, Tamil and Telugu Speech Synthesis Systems

Abstract

We present free, high-quality multi-speaker speech corpora for Gujarati, Kannada, Malayalam, Marathi, Tamil and Telugu, which are six of the twenty-two official languages of India and are spoken by 374 million native speakers. The datasets are primarily intended for text-to-speech (TTS) applications, such as constructing multilingual voices or performing speaker or language adaptation. Most of the corpora (apart from Marathi, which is a female-only database) consist of at least 2,000 recorded lines from female and male native speakers of the language. We present the methodological details behind corpus acquisition, which can be scaled to acquiring data for other languages of interest. We describe experiments in building a multilingual text-to-speech model constructed by combining our corpora. Our results indicate that these corpora yield good-quality voices, with Mean Opinion Scores (MOS) > 3.6 for all the languages tested. We believe that these resources, released under an open-source license, and the described methodology will help advance speech applications for the languages described and aid corpus development for other, smaller languages of India and beyond.