Indian Journal of Science and Technology

Language Models Creation for the Tatar Speech Recognition System


Abstract

Objectives: The article presents experiments on creating different language models for the Tatar language. N-gram statistical models are used with five different smoothing techniques. Methods: These models can be used in various applications, such as machine translation systems and spell checking. This study intends to use the models in an automatic speech recognition system for Tatar. Because Tatar has a rich morphology, speech recognition systems may use not only words but also the building blocks of words as basic modeling units: syllables, morphemes, etc. Findings: The following basic units were chosen for a complete analysis of Tatar language model development: word, morpheme, morph (a statistically selected component of a word), stem and affix chain, syllable, and letter. Models were constructed for all combinations of n-gram order (2-, 3-, and 4-grams), smoothing technique, and basic unit. In addition, an experiment was conducted to show that a language model based on word classes can be developed. Conclusion: Based on the experimental results, conclusions are drawn about the quality of the description of Tatar grammar, the degree of lexicon coverage, and the required vocabulary size for each type of constructed model.
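To make the setup in the abstract concrete, the following is a minimal sketch of an n-gram model that can be trained over different basic units and evaluated by perplexity. It is an illustration only, not the authors' implementation. The abstract does not name the five smoothing techniques, so add-k (Laplace) smoothing is assumed here as a stand-in. Only the "word" and "letter" units are shown, because the other units would need a morphological analyzer or syllabifier. The function names and the toy corpus are hypothetical.

```python
from collections import Counter
import math


def tokenize(sentence, unit):
    """Split a sentence into modeling units: 'word' or 'letter'."""
    if unit == "word":
        return sentence.split()
    if unit == "letter":
        return [ch for ch in sentence if not ch.isspace()]
    raise ValueError(f"unsupported unit: {unit}")


def pad(tokens, n):
    """Add sentence-boundary markers so every token has a full context."""
    return ["<s>"] * (n - 1) + tokens + ["</s>"]


def train_ngram(sentences, n, unit):
    """Count n-grams and their (n-1)-token contexts over the corpus."""
    ngrams, contexts, vocab = Counter(), Counter(), set()
    for sentence in sentences:
        toks = pad(tokenize(sentence, unit), n)
        vocab.update(toks)
        for i in range(n - 1, len(toks)):
            ctx = tuple(toks[i - n + 1:i])
            ngrams[ctx + (toks[i],)] += 1
            contexts[ctx] += 1
    return ngrams, contexts, vocab


def prob(model, ctx, tok, k=1.0):
    """Add-k smoothed P(tok | ctx); k is an assumed smoothing constant."""
    ngrams, contexts, vocab = model
    ctx = tuple(ctx)
    return (ngrams[ctx + (tok,)] + k) / (contexts[ctx] + k * len(vocab))


def perplexity(model, sentence, n, unit, k=1.0):
    """Per-token perplexity of one sentence under the smoothed model."""
    toks = pad(tokenize(sentence, unit), n)
    log_sum = 0.0
    count = 0
    for i in range(n - 1, len(toks)):
        log_sum += math.log(prob(model, toks[i - n + 1:i], toks[i], k))
        count += 1
    return math.exp(-log_sum / count)


# Toy placeholder corpus (not data from the paper).
corpus = ["a b a b", "a b c"]
for unit in ("word", "letter"):
    for n in (2, 3, 4):
        model = train_ngram(corpus, n, unit)
        print(unit, n, perplexity(model, "a b c", n, unit))
```

Looping over unit and n-gram order mirrors the grid of model combinations described in the abstract. A real comparison would also vary the smoothing technique and use held-out Tatar text.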
