Conference: International Conference on Bangla Speech and Language Processing

A Sequence-to-Sequence Pronunciation Model for Bangla Speech Synthesis


Abstract

Extracting pronunciations from written text is necessary in many application areas, especially text-to-speech synthesis. Bangla is not a fully phonetic language: its orthography does not always map directly to pronunciation. The main difficulty is the "schwa deletion" problem, along with some other ambiguous letters and conjuncts. Rule-based approaches cannot completely solve this problem. In this paper, we propose adopting an encoder-decoder neural machine translation (NMT) model to determine the pronunciations of Bangla words. We cast pronunciation prediction as a sequence-to-sequence problem and used two gated recurrent unit recurrent neural networks (GRU-RNNs) in our model. We supplied the model with two types of input data: in one model we used "raw" words, and in the other we used "pre-processed" words (normalized by hand-written rules). Both experiments showed promising results, and both models can be used in practical applications.
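To make the sequence-to-sequence framing concrete, the following is a minimal sketch of how a word and its pronunciation might be turned into the integer sequences an encoder-decoder model consumes. It is not the authors' code: the special tokens, the helper names `build_vocab` and `encode`, and the single toy lexicon entry (with a rough romanized pronunciation) are all assumptions made for illustration.

```python
# Special symbols for padding, start, and end of a sequence (assumed names).
PAD, SOS, EOS = "<pad>", "<sos>", "<eos>"


def build_vocab(sequences):
    """Map every symbol seen in the sequences to an integer id."""
    symbols = sorted({sym for seq in sequences for sym in seq})
    return {sym: i for i, sym in enumerate([PAD, SOS, EOS] + symbols)}


def encode(seq, vocab):
    """Wrap a symbol sequence in SOS/EOS and convert it to ids."""
    return [vocab[SOS]] + [vocab[sym] for sym in seq] + [vocab[EOS]]


# Toy lexicon: (written word, rough romanized pronunciation).
# Illustrative only; not taken from the paper's data.
lexicon = [("কলম", ["k", "o", "l", "o", "m"])]

# Source side: one symbol per written character (grapheme).
sources = [list(word) for word, _ in lexicon]
# Target side: one symbol per pronunciation unit.
targets = [phones for _, phones in lexicon]

src_vocab = build_vocab(sources)
tgt_vocab = build_vocab(targets)

src_ids = encode(sources[0], src_vocab)  # input to the encoder
tgt_ids = encode(targets[0], tgt_vocab)  # sequence the decoder learns to emit

print(len(sources[0]), "graphemes ->", len(targets[0]), "pronunciation symbols")
```

In this toy entry the written form has three characters while the pronunciation has five symbols, because inherent vowels are not written. Unequal source and target lengths of this kind are one reason an encoder-decoder model is a natural fit: the decoder can emit an output of any length.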
