Journal: Computer Speech and Language
Phonemic transcription by analogy in text-to-speech synthesis: Novel word pronunciation and lexicon compression

Abstract

The synthesis of speech from unrestricted text needs a phonemic transcription including syllabification and lexical stress for each word and symbol. Speech synthesizers currently use large lexicons to provide such transcriptions, but not every word has a lexical entry and a backup is required to produce transcriptions for novel words. In addition, synthesizers do not have an infinite amount of memory at their disposal, so it is not always possible continually to append supplementary lexemes for specialized applications in the hope of reducing the probability of encountering a novel word. Transcriptions for novel words are produced by implicit analogy with an existing lexicon. A data-driven technique of extracting context-dependent grapheme-to-phoneme rules with dynamically minimized context lengths from a training lexicon is proposed. Syllable boundary and lexical stress information is included in the transcriptions. The proposed system satisfies certain pragmatic constraints: it can produce transcriptions with sufficient rapidity to maintain real-time processing in a text-to-speech system; the rules take up a small amount of storage size (370 KBytes); and a pronunciation can be generated for any novel word. The quality of the transcription process enables 77.06% of lexemes formerly present in the training lexicon to be excluded, thus reducing the lexicon's memory requirements by 74.18% (of 3.57 MBytes).
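
The abstract does not spell out the rule-extraction algorithm, so the following is only a toy sketch of the general idea of context-dependent grapheme-to-phoneme rules whose context is kept as short as the training lexicon allows. Everything in it is an assumption made for illustration: the four-word lexicon, the one-to-one grapheme/phoneme alignment with "_" as a null phoneme, the symmetric context window, the `max_k` limit, and the function names. Syllable boundaries and lexical stress, which the proposed system includes, are left out.

```python
from collections import Counter, defaultdict

PAD = "#"  # hypothetical word-boundary symbol

# Hypothetical toy lexicon: each grapheme is aligned with exactly one phoneme,
# and "_" stands for a silent grapheme. Not data from the paper.
TOY_LEXICON = [
    ("cat", ["k", "a", "t"]),
    ("cot", ["k", "o", "t"]),
    ("cell", ["s", "e", "l", "_"]),
    ("city", ["s", "i", "t", "i"]),
]


def context(word, i, k):
    """Return (left, grapheme, right) with k graphemes of context on each side."""
    padded = [PAD] * k + list(word) + [PAD] * k
    j = i + k
    return ("".join(padded[j - k:j]), padded[j], "".join(padded[j + 1:j + k + 1]))


def extract_rules(lexicon, max_k=3):
    """Keep, for each grapheme occurrence, the narrowest unambiguous context."""
    # Per-grapheme phoneme counts, used when no rule matches.
    fallback = defaultdict(Counter)
    for word, phones in lexicon:
        for g, p in zip(word, phones):
            fallback[g][p] += 1

    rules = {}
    pending = [(w, i) for w, (word, _) in enumerate(lexicon) for i in range(len(word))]
    for k in range(max_k + 1):
        # Phonemes observed for every width-k context in the whole lexicon.
        seen = defaultdict(set)
        for word, phones in lexicon:
            for i, p in enumerate(phones):
                seen[context(word, i, k)].add(p)
        still_ambiguous = []
        for w, i in pending:
            word, phones = lexicon[w]
            ctx = context(word, i, k)
            if len(seen[ctx]) == 1:
                rules[ctx] = phones[i]  # width k suffices; stop widening
            else:
                still_ambiguous.append((w, i))
        pending = still_ambiguous
    return rules, fallback


def transcribe(word, rules, fallback, max_k=3):
    """Apply the narrowest matching rule per grapheme, else the commonest phoneme."""
    output = []
    for i, g in enumerate(word):
        for k in range(max_k + 1):
            ctx = context(word, i, k)
            if ctx in rules:
                output.append(rules[ctx])
                break
        else:
            if g in fallback:
                output.append(fallback[g].most_common(1)[0][0])
    return output


rules, fallback = extract_rules(TOY_LEXICON)
print(len(rules), transcribe("cit", rules, fallback))
```

In this sketch, graphemes such as "t" need no context at all, while "c" needs one neighbouring grapheme on each side to separate its two phonemes. Because the extracted rules reproduce every training entry, those entries could in principle be dropped from the lexicon, which is the kind of compression the abstract reports.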