A hybrid language model based on statistics and linguistic rules

Abstract

Language modeling is an active research topic in many domains, including speech recognition, optical character recognition, handwriting recognition, machine translation and spelling correction. There are two main types of language model: mathematical and linguistic. The most widely used mathematical language model is the n-gram model, which is inferred from statistics. This model has three problems: the long-distance restriction, its recursive nature and only partial language understanding. Language models based on linguistics, in turn, present many difficulties when applied to large-scale real texts. We present here a new hybrid language model that combines the advantages of the n-gram statistical language model with those of a linguistic language model that makes use of grammatical or semantic rules. Given suitable rules, this hybrid model can address the long-distance restriction, recursive nature and partial language understanding problems. The new language model has been effective in experiments and has been incorporated in Chinese sentence input products for Windows and Macintosh OS.
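The abstract does not say how the statistical and rule-based components are combined, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes an add-one smoothed bigram model whose log-probability is adjusted by a fixed penalty whenever a hand-written rule is violated. The toy English corpus, the `NOUN_NUMBER` and `VERB_NUMBER` lexicons, the agreement rule and the penalty value are all hypothetical and chosen only to illustrate the long-distance restriction.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count unigrams and bigrams over tokenized sentences with boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams


def bigram_logprob(tokens, unigrams, bigrams):
    """Add-one smoothed bigram log-probability of one tokenized sentence."""
    vocab_size = len(unigrams)
    padded = ["<s>"] + tokens + ["</s>"]
    total = 0.0
    for prev, word in zip(padded, padded[1:]):
        total += math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size))
    return total


# Hypothetical toy lexicons for a subject-verb agreement rule.
NOUN_NUMBER = {"dog": "sg", "dogs": "pl", "park": "sg"}
VERB_NUMBER = {"barks": "sg", "bark": "pl"}


def agreement_penalty(tokens, penalty=-5.0):
    """Return a penalty if the first known noun and first known verb disagree in number."""
    subject = next((NOUN_NUMBER[t] for t in tokens if t in NOUN_NUMBER), None)
    verb = next((VERB_NUMBER[t] for t in tokens if t in VERB_NUMBER), None)
    if subject and verb and subject != verb:
        return penalty
    return 0.0


def hybrid_score(tokens, unigrams, bigrams):
    """Bigram log-probability plus the rule-based adjustment (an assumed combination)."""
    return bigram_logprob(tokens, unigrams, bigrams) + agreement_penalty(tokens)


corpus = [
    "the dog barks".split(),
    "the dogs bark".split(),
    "the dog in the park barks".split(),
]
unigrams, bigrams = train_bigram(corpus)

good = "the dogs in the park bark".split()
bad = "the dogs in the park barks".split()

# The bigram model sees only "park barks" and so prefers the ungrammatical
# candidate; the rule looks across the whole sentence and reverses the ranking.
print(bigram_logprob(good, unigrams, bigrams) < bigram_logprob(bad, unigrams, bigrams))
print(hybrid_score(good, unigrams, bigrams) > hybrid_score(bad, unigrams, bigrams))
```

In this toy setting the subject "dogs" and its verb are four tokens apart, which is outside a bigram's window. A real system would need a much richer grammar and a principled way to weight rule evidence against n-gram probabilities.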
