
Probabilistic Lexical Generalization for French Dependency Parsing



Abstract

This paper investigates the impact on French dependency parsing of lexical generalization methods beyond lemmatization and morphological analysis. A distributional thesaurus is created from a large text corpus and used for distributional clustering and WordNet automatic sense ranking. The standard approach for lexical generalization in parsing is to map a word to a single generalized class, either replacing the word with the class or adding a new feature for the class. We use a richer framework that allows for probabilistic generalization, with a word represented as a probability distribution over a space of generalized classes: lemmas, clusters, or synsets. Probabilistic lexical information is introduced into parser feature vectors by modifying the weights of lexical features. We obtain improvements in parsing accuracy with some lexical generalization configurations in experiments run on the French Treebank and two out-of-domain treebanks, with slightly better performance for the probabilistic lexical generalization approach compared to the standard single-mapping approach.
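
To make the contrast between the two approaches concrete, here is a minimal sketch in Python. It assumes a word is looked up in a table of class probabilities and that each lexical feature is emitted once per class, weighted by that probability. The abstract does not give the exact feature encoding, so the function names, the feature-template string, and the example distribution for "avocat" are all hypothetical and purely illustrative.

```python
from typing import Dict

# Hypothetical word -> class distribution table (illustrative values only,
# not taken from the paper). Classes could be lemmas, clusters, or synsets.
CLASS_PROBS: Dict[str, Dict[str, float]] = {
    "avocat": {"cluster_profession": 0.75, "cluster_food": 0.25},
}


def probabilistic_features(
    word: str, template: str, class_probs: Dict[str, Dict[str, float]]
) -> Dict[str, float]:
    """Emit one weighted feature per generalized class of `word`.

    Words missing from the table fall back to themselves with probability 1.
    """
    dist = class_probs.get(word, {word: 1.0})
    total = sum(dist.values())  # renormalize in case the table is not normalized
    return {f"{template}={cls}": p / total for cls, p in dist.items()}


def single_mapping_features(
    word: str, template: str, class_probs: Dict[str, Dict[str, float]]
) -> Dict[str, float]:
    """Standard approach: keep only the most probable class, with weight 1."""
    dist = class_probs.get(word, {word: 1.0})
    best = max(dist, key=dist.get)
    return {f"{template}={best}": 1.0}


if __name__ == "__main__":
    print(probabilistic_features("avocat", "head_class", CLASS_PROBS))
    print(single_mapping_features("avocat", "head_class", CLASS_PROBS))
```

In this sketch the single-mapping variant discards the less likely class entirely, while the probabilistic variant keeps both and lets a linear parsing model see each class in proportion to its probability.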
