Conference: International Joint Conference on Natural Language Processing

High Speed Unknown Word Prediction Using Support Vector Machine For Chinese Text-to-Speech Systems

Abstract

One of the most significant problems in POS (Part-of-Speech) tagging of Chinese text is identifying the words in a sentence, since no blank space delimits them. Because it is impossible to pre-register every word in a dictionary, unknown words inevitably occur during this process. The unknown word problem therefore has a marked effect on the accuracy of the sound produced by a Chinese TTS (Text-to-Speech) system. In this paper, we present an SVM (support vector machine) based method that predicts unknown words from the output of word segmentation and tagging. To reach the processing speed required in a TTS system, we pre-detect the candidate boundaries of unknown words before starting the actual prediction. Unknown word prediction is thus performed in two phases: detection, then prediction. The experimental results are very promising, showing high precision and high recall together with high speed.
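The two-phase idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how candidates are detected or which features the SVM uses, so everything below is an assumption made for illustration: the detection rule (a run of two or more consecutive single-character tokens), the `UNK` tag, and the names `detect_candidates`, `predict_unknown_words` and `is_word` are hypothetical. The `is_word` callable stands in for a trained SVM decision function; no real classifier is trained here.

```python
from typing import Callable, List, Tuple

Token = Tuple[str, str]   # (surface form, POS tag) from segmentation and tagging
Span = Tuple[int, int]    # half-open token index range [start, end)


def detect_candidates(tokens: List[Token]) -> List[Span]:
    """Phase 1 (assumed heuristic): flag runs of 2+ single-character tokens."""
    spans: List[Span] = []
    start = None
    for i, (form, _tag) in enumerate(tokens):
        if len(form) == 1:
            if start is None:
                start = i            # a new run of single characters begins
        else:
            if start is not None and i - start >= 2:
                spans.append((start, i))
            start = None
    # Close a run that reaches the end of the sentence.
    if start is not None and len(tokens) - start >= 2:
        spans.append((start, len(tokens)))
    return spans


def predict_unknown_words(
    tokens: List[Token],
    is_word: Callable[[List[Token]], bool],
) -> List[Token]:
    """Phase 2: call the classifier only on candidate spans and merge accepted ones."""
    out: List[Token] = []
    pos = 0
    for start, end in detect_candidates(tokens):
        out.extend(tokens[pos:start])          # copy tokens before the candidate
        span = tokens[start:end]
        if is_word(span):
            merged = "".join(form for form, _tag in span)
            out.append((merged, "UNK"))        # hypothetical tag for a predicted word
        else:
            out.extend(span)                   # rejected: keep the original tokens
        pos = end
    out.extend(tokens[pos:])
    return out


if __name__ == "__main__":
    sentence = [("我", "r"), ("喜欢", "v"), ("奥", "x"), ("巴", "x"), ("马", "x")]
    # Toy stand-in for an SVM decision function: accept every candidate.
    print(predict_unknown_words(sentence, lambda span: True))
```

Under these assumptions the speed benefit comes from phase 1 being a single linear scan, so the comparatively expensive classifier runs only on the few flagged spans rather than at every token position.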
