Published in: Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions

All Roads Lead to UD: Converting Stanford and Penn Parses to English Universal Dependencies with Multilayer Annotations


Abstract

We describe and evaluate different approaches to the conversion of gold standard corpus data from Stanford Typed Dependencies (SD) and Penn-style constituent trees to the latest English Universal Dependencies representation (UD 2.2). Our results indicate that pure SD to UD conversion is highly accurate across multiple genres, resulting in around 1.5% errors, but can be improved further to fewer than 0.5% errors given access to annotations beyond the pure syntax tree, such as entity types and coreference resolution, which are necessary for correct generation of several UD relations. We show that constituent-based conversion using CoreNLP (with automatic NER) performs substantially worse in all genres, including when using gold constituent trees, primarily due to underspecification of phrasal grammatical functions.
