Conference: Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions

All Roads Lead to UD: Converting Stanford and Penn Parses to English Universal Dependencies with Multilayer Annotations



Abstract

We describe and evaluate different approaches to the conversion of gold standard corpus data from Stanford Typed Dependencies (SD) and Penn-style constituent trees to the latest English Universal Dependencies representation (UD 2.2). Our results indicate that pure SD to UD conversion is highly accurate across multiple genres, resulting in around 1.5% errors, but can be improved further to fewer than 0.5% errors given access to annotations beyond the pure syntax tree, such as entity types and coreference resolution, which are necessary for correct generation of several UD relations. We show that constituent-based conversion using CoreNLP (with automatic NER) performs substantially worse in all genres, including when using gold constituent trees, primarily due to underspecification of phrasal grammatical functions.
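
The error percentages quoted above can be read as the share of tokens whose converted analysis disagrees with the gold UD analysis. The abstract does not define the metric, so the following is only a minimal sketch under that assumption: a token counts as an error when its HEAD or DEPREL column differs between two CoNLL-U files with identical tokenization. The function names (`make_conllu`, `parse_conllu`, `error_rate`) and the toy sentence are hypothetical and are not taken from the paper.

```python
def make_conllu(rows):
    """Join 10-column token rows into tab-separated CoNLL-U text."""
    return "\n".join("\t".join(row) for row in rows)


def parse_conllu(text):
    """Return (head, deprel) pairs for the basic tokens of a CoNLL-U string."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if "-" in cols[0] or "." in cols[0]:
            continue
        pairs.append((cols[6], cols[7]))  # HEAD and DEPREL columns
    return pairs


def error_rate(converted, gold):
    """Fraction of tokens whose head or relation differs from the gold tree.

    Assumption: both inputs share the same tokenization and token order.
    """
    conv, ref = parse_conllu(converted), parse_conllu(gold)
    if len(conv) != len(ref):
        raise ValueError("token counts differ; cannot align the two files")
    if not ref:
        return 0.0
    errors = sum(1 for c, g in zip(conv, ref) if c != g)
    return errors / len(ref)


# Toy example: the converted tree keeps the SD label "dobj" where UD v2 uses "obj".
GOLD = make_conllu([
    ("1", "She", "she", "PRON", "PRP", "_", "2", "nsubj", "_", "_"),
    ("2", "reads", "read", "VERB", "VBZ", "_", "0", "root", "_", "_"),
    ("3", "books", "book", "NOUN", "NNS", "_", "2", "obj", "_", "_"),
    ("4", ".", ".", "PUNCT", ".", "_", "2", "punct", "_", "_"),
])
CONVERTED = GOLD.replace("\tobj\t", "\tdobj\t")

print(error_rate(CONVERTED, GOLD))  # 0.25
```

On this reading, an error rate of 1.5% would correspond to roughly 15 mismatched tokens per 1,000; a stricter or looser metric (for example, labels only) would give different figures.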
