
TURKISH DICTATION SYSTEM FOR BROADCAST NEWS APPLICATIONS



Abstract

We have designed a Turkish dictation system for broadcast news applications. Turkish is an agglutinative language with free word order. When words are used as recognition units, these characteristics lead to vocabulary explosion, a large number of out-of-vocabulary (OOV) words, and complex N-gram language models in speech recognition. We therefore propose new recognition units. We parsed some of the words into smaller recognition units such as stems, endings, and morphemes, and introduced these smaller units, together with the unparsed words, to the speech recognizer as lexicon entries. In this way, we were able to overcome the problem of a large number of OOV words with a moderate vocabulary size and obtain better estimates for the N-gram language models. However, the best recognition result was obtained using the word-based language model.
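The effect of sub-word lexicon entries on the OOV rate can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two-entry ending list `TOY_ENDINGS`, the helper names `split_word`, `to_units`, and `oov_rate`, and the toy training and test words are all hypothetical, and the naive suffix stripping stands in for the morphological parsing the abstract refers to, whose details are not given here.

```python
from typing import Iterable, List, Set

# Hypothetical toy ending inventory (plural "-ler", locative "-de").
# A real system would rely on a morphological parser, not a fixed list.
TOY_ENDINGS = ("ler", "de")


def split_word(word: str, stems: Set[str]) -> List[str]:
    """Return [stem, '+ending'] if the word is a known stem plus a toy ending,
    otherwise return the word unparsed as a single unit."""
    for ending in TOY_ENDINGS:
        stem = word[: -len(ending)]
        if word.endswith(ending) and stem in stems:
            return [stem, "+" + ending]
    return [word]


def to_units(words: Iterable[str], stems: Set[str]) -> List[str]:
    """Flatten a word sequence into a sequence of recognition units."""
    units: List[str] = []
    for word in words:
        units.extend(split_word(word, stems))
    return units


def oov_rate(tokens: Iterable[str], lexicon: Set[str]) -> float:
    """Fraction of tokens that are missing from the lexicon."""
    tokens = list(tokens)
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t not in lexicon) / len(tokens)


# Toy data: "ev" = house, "göz" = eye.
stems = {"ev", "göz"}
train_words = ["ev", "evler", "göz", "gözde"]
test_words = ["evde", "gözler", "ev"]

# Word-based lexicon: every distinct training word is an entry.
word_lexicon = set(train_words)
word_oov = oov_rate(test_words, word_lexicon)

# Sub-word lexicon: stems, endings, and unparsed words are the entries.
subword_lexicon = set(to_units(train_words, stems))
subword_oov = oov_rate(to_units(test_words, stems), subword_lexicon)

print(f"word-based OOV rate: {word_oov:.2f}")
print(f"sub-word OOV rate:   {subword_oov:.2f}")
```

In this toy data, two of the three test words are unseen as whole words, while every stem and ending they contain already appears in the sub-word lexicon, so the sub-word OOV rate drops to zero. A lower OOV rate does not by itself imply better recognition accuracy, which is consistent with the abstract's report that the word-based language model still gave the best result.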
