Learning Phrase Break Detection in Thai Text-to-Speech

Abstract

One of the crucial problems in developing high-quality Thai text-to-speech synthesis is detecting phrase breaks in Thai text. Unlike English, Thai has no word boundary delimiter and no punctuation mark at the end of a sentence, which makes the problem harder: an incorrectly detected phrase break not only produces unnatural speech but can also convey the wrong meaning. In this paper, we apply two machine learning algorithms, C4.5 and RIPPER, to phrase break detection. These algorithms can learn features that are useful for locating a phrase break position. The features investigated in our experiments are collocations in different window sizes and the number of syllables before and after the word in question up to a phrase break position. We compare the results of C4.5 and RIPPER with a baseline method (a part-of-speech sequence model). The experiments show that C4.5 and RIPPER both appear to outperform the baseline method, and that RIPPER achieves higher accuracy than C4.5.
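
The two feature families named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the token format, the tag names, the function `juncture_features`, and the `window` parameter are all hypothetical. It also assumes that syllables are counted to the start and end of the input sequence, since the abstract does not define the exact reference points. The learners themselves (C4.5 and RIPPER) are not shown; the resulting dictionaries would simply be the input rows for a decision-tree or rule learner.

```python
from typing import Dict, List, Tuple

# Hypothetical token format: (word, part-of-speech tag, syllable count).
Token = Tuple[str, str, int]

PAD = "<pad>"  # filler for window positions that fall outside the sequence


def juncture_features(tokens: List[Token], i: int, window: int = 2) -> Dict[str, object]:
    """Build features for the candidate break position right after tokens[i]."""
    feats: Dict[str, object] = {}
    # Collocation features: words and tags in a window around the juncture.
    for offset in range(-window + 1, window + 1):
        j = i + offset
        in_range = 0 <= j < len(tokens)
        feats[f"word[{offset:+d}]"] = tokens[j][0] if in_range else PAD
        feats[f"pos[{offset:+d}]"] = tokens[j][1] if in_range else PAD
    # Syllable-count features (assumption: counted to the sequence edges).
    feats["syll_before"] = sum(t[2] for t in tokens[: i + 1])
    feats["syll_after"] = sum(t[2] for t in tokens[i + 1:])
    return feats


# Placeholder tokens; real input would come from a Thai word segmenter and tagger.
tokens = [("w1", "NOUN", 2), ("w2", "VERB", 1), ("w3", "NOUN", 3), ("w4", "ADV", 1)]
features = juncture_features(tokens, 1, window=2)
```

Changing `window` gives the "different window sizes" the abstract mentions; each candidate position yields one feature dictionary, to be paired with a break or no-break label for training.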
