Source: 2010 International Conference for Internet Technology and Secured Transactions
Statistical syllables selection approach for the preparation of Punjabi speech database

Abstract

This paper discusses the results of a statistical analysis of Punjabi syllables over a large Punjabi corpus. Syllables have been reported to be a good choice of speech unit for the speech databases of many languages, and they have also been selected here as the speech unit for developing the Punjabi speech database. To minimize the database size, efforts have been made to select a minimal set of syllables that covers almost the whole Punjabi word set. For this purpose, all Punjabi syllables have been statistically analyzed on a Punjabi corpus of more than 104 million words. The analysis yields important results that help to select a relatively small set of the most frequently occurring syllables: about the first ten thousand syllables (0.86% of the 1156740 total available syllables), with a cumulative frequency of occurrence (FOO) of less than 99.81%. In addition, to improve the efficiency of the text-to-speech (TTS) system, interesting facts about Punjabi syllables have been obtained from their FOO at three positions in words (start, middle, and end).
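The frequency-based selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how words are split into syllables, so the sketch assumes the corpus is already syllabified (each word is a list of syllable strings). The function names `select_syllables` and `positional_counts`, the `coverage` parameter, and the Latin-script toy syllables are all hypothetical. Counting a one-syllable word under "start" is likewise an assumption.

```python
from collections import Counter


def select_syllables(words, coverage):
    """Pick the most frequent syllables until their cumulative share of
    all syllable occurrences reaches `coverage` (a fraction in 0..1).

    `words` is an iterable of already-syllabified words, each a list of
    syllable strings (an assumption; syllabification is out of scope).
    Returns (selected_syllables, achieved_cumulative_share).
    """
    counts = Counter(s for word in words for s in word)
    total = sum(counts.values())
    selected, running = [], 0
    # most_common() yields syllables in descending order of frequency.
    for syllable, n in counts.most_common():
        if running >= coverage * total:
            break
        selected.append(syllable)
        running += n
    return selected, (running / total if total else 0.0)


def positional_counts(words):
    """Count syllable occurrences separately at the start, middle and end
    of words. A one-syllable word is counted under "start" only
    (an assumption of this sketch)."""
    pos = {"start": Counter(), "middle": Counter(), "end": Counter()}
    for word in words:
        last = len(word) - 1
        for i, s in enumerate(word):
            if i == 0:
                pos["start"][s] += 1
            elif i == last:
                pos["end"][s] += 1
            else:
                pos["middle"][s] += 1
    return pos


# Toy, made-up data purely for illustration (not from the paper's corpus).
toy_words = [["ka", "ra"], ["ka", "ma", "ra"], ["ka"], ["ta", "ra"]]
chosen, share = select_syllables(toy_words, coverage=0.75)
by_position = positional_counts(toy_words)
```

On a real corpus, the same loop would be run with a coverage target chosen by the database designer; the positional counts could then indicate which syllables need recordings in start, middle, or end context.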
