Integrating Probability into LR Parsing


Abstract

In this paper, we describe methods for acquiring stochastic knowledge from a corpus and for integrating both lexical and syntactic probabilities into an LR parser in order to improve its disambiguation ability. Building on a Hidden Markov Model (HMM) tagging system, we introduce the acquisition of lexical statistical information. This information can be learned from the corpus whether or not the corpus is tagged. At the syntactic (grammar) level, we propose two different concepts: the derivation probability and the reduction probability of a Context-Free Grammar (CFG) rule. Because the original Inside-Outside algorithm can only estimate the derivation probabilities of rules, we designed and implemented an automatic method to estimate the reduction probabilities of rules from the corpus. We also applied both the lexical probabilities and the rule probabilities of the grammar to the CFG parser of an English-to-Chinese MT system. We set up a new scoring system, based on this statistical knowledge, as the criterion for disambiguating syntactic structures. Experiments show that parsing accuracy is improved.
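The abstract does not give the scoring formula, so the following is only a minimal sketch of how lexical and rule probabilities might be combined to rank competing parses. It assumes the score is the sum of log probabilities over the rules and the word/tag pairs used in a parse. The names `LEXICAL_PROB`, `RULE_PROB`, `score`, and `best_parse`, and all of the numbers, are hypothetical and are not taken from the paper.

```python
import math

# Hypothetical lexical probabilities P(tag | word), invented for illustration.
LEXICAL_PROB = {
    ("saw", "V"): 0.7,
    ("saw", "N"): 0.3,
}

# Hypothetical rule probabilities, invented for illustration.
RULE_PROB = {
    "VP -> V NP PP": 0.2,
    "VP -> V NP": 0.5,
    "NP -> NP PP": 0.1,
}


def score(parse):
    """Return a log-probability score for one candidate parse.

    Assumption: the score is the sum of the log rule probabilities and
    the log lexical probabilities used in the parse. The paper's actual
    scoring system may differ.
    """
    total = 0.0
    for rule in parse["rules"]:
        total += math.log(RULE_PROB[rule])
    for word, tag in parse["tags"]:
        total += math.log(LEXICAL_PROB[(word, tag)])
    return total


def best_parse(candidates):
    """Pick the candidate parse with the highest score."""
    return max(candidates, key=score)


# Two competing analyses of a prepositional-phrase attachment ambiguity.
vp_attach = {
    "name": "vp_attach",
    "rules": ["VP -> V NP PP"],
    "tags": [("saw", "V")],
}
np_attach = {
    "name": "np_attach",
    "rules": ["VP -> V NP", "NP -> NP PP"],
    "tags": [("saw", "V")],
}

winner = best_parse([vp_attach, np_attach])
print(winner["name"])
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, and it lets an LR parser accumulate the score incrementally as it performs each shift or reduction.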
