International Symposium on Natural Language Processing

A Real-time Thai Speech Synthesizer on a Mobile Device

Abstract

Several Thai TTS systems are already available on resource-rich platforms such as personal computers. However, porting these systems to a resource-limited device such as a mobile phone is not an easy task. Practical aspects, including application size and processing time, have to be considered. In this paper, we aim to develop a Thai speech synthesizer that can produce output speech in real time on a mobile device. Our synthesizer is based on Flite, an open-source synthesis library developed by Carnegie Mellon University. Flite is suitable for a resource-limited device as it is both small and fast. To use Flite as a text-to-speech engine for Thai, many components have to be modified. First, a word segmentation component and a Thai pronunciation dictionary are added to determine the word boundaries and the pronunciation of each word in Thai input text. To minimize resource usage, a simple word segmentation algorithm, longest matching, is employed. Next, to handle the tones in Thai, we integrate tones with phones and define a tonal phone set for Thai. Lastly, a small Thai speech database is essential. For this, we transform a unit selection database into a diphone database by selecting only the necessary diphones. We conducted an experiment to compare our speech synthesizer with pTalk, an HMM-based speech synthesizer, in terms of both speed and sound quality, the latter measured by a subjective listening test. While the quality of our output speech may not be as good as the output from pTalk, our system is much faster and more stable than pTalk.
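
The front-end steps named in the abstract (longest-matching segmentation, dictionary lookup, tonal phones, diphones) can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not use the Flite API. The four-word lexicon, the romanized phone symbols, the tone digits (0 = mid, 1 = low), the choice to attach the tone to the vowel phone, and the "pau" silence padding are all assumptions made for illustration.

```python
# Toy pronunciation dictionary (hypothetical): word -> list of syllables,
# each syllable given as (onset, vowel, coda, tone).
# Phone symbols and tone digits (0 = mid, 1 = low) are illustrative only.
LEXICON = {
    "ตา": [("t", "aa", "", 0)],
    "ตาก": [("t", "aa", "k", 1)],
    "กลม": [("kl", "o", "m", 0)],
    "ลม": [("l", "o", "m", 0)],
}


def segment_longest_match(text, lexicon):
    """Greedy left-to-right longest matching.

    At each position, take the longest lexicon entry that matches;
    if nothing matches, emit a single character so the loop always advances.
    """
    max_len = max((len(word) for word in lexicon), default=1)
    words, i = [], 0
    while i < len(text):
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if length == 1 or candidate in lexicon:
                words.append(candidate)
                i += length
                break
    return words


def to_tonal_phones(words, lexicon):
    """Look up each word and fuse the tone digit into the vowel phone (assumed convention)."""
    phones = []
    for word in words:
        for onset, vowel, coda, tone in lexicon.get(word, []):  # unknown words are skipped
            phones.append(onset)
            phones.append(f"{vowel}{tone}")
            if coda:
                phones.append(coda)
    return phones


def to_diphones(phones, silence="pau"):
    """Pair adjacent phones, padding with silence at both ends."""
    padded = [silence, *phones, silence]
    return [f"{a}-{b}" for a, b in zip(padded, padded[1:])]


words = segment_longest_match("ตากลม", LEXICON)
phones = to_tonal_phones(words, LEXICON)
units = to_diphones(phones)
print(words)   # ['ตาก', 'ลม']
print(phones)  # ['t', 'aa1', 'k', 'l', 'o0', 'm']
print(units)
```

In this toy example the greedy matcher picks "ตาก" + "ลม" even though "ตา" + "กลม" is also a valid split. This is the kind of ambiguity a longest-matching segmenter accepts in exchange for its small size and speed.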
