Conference on Computational Natural Language Learning

Universal Joint Morph-Syntactic Processing: The Open University of Israel's Submission to The CoNLL 2017 Shared Task



Abstract

We present the Open University of Israel's submission (ID OpenU-NLP-Lab) to the CoNLL 2017 UD Shared Task on multilingual parsing from raw text to Universal Dependencies. The core of our system is a joint morphological disambiguator and syntactic parser that accepts morphologically analyzed surface tokens as input and returns morphologically disambiguated dependency trees as output. Because the parser requires a lattice as input, we generate morphological analyses of surface tokens with a data-driven morphological analyzer that derives its lexicon from the UD training corpora, and we rely on UDPipe for sentence segmentation and surface-level tokenization. Our official macro-average LAS is 56.56. Although our model does not perform as well as many others, it does not use neural networks, and therefore does not rely on word embeddings or on any data source other than the corpora themselves. In addition, we show the utility of a lexicon-backed morphological analyzer for Modern Hebrew, a morphologically rich language (MRL). We use our results on Modern Hebrew to argue that the UD community should define a UD-compatible standard for access to lexical resources, which we argue is crucial for MRLs and low-resource languages in particular.
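The idea of a data-driven analyzer whose lexicon comes from the training corpora can be illustrated with a minimal sketch. This is not the authors' analyzer: it only assumes the standard ten-column CoNLL-U layout, and the names `build_lexicon`, `analyze`, and the inline sample sentences are hypothetical, chosen for illustration. Each surface form is mapped to every (lemma, UPOS, FEATS) analysis observed in training, and the resulting candidate set is the kind of ambiguous input a joint disambiguator and parser would then resolve.

```python
from collections import defaultdict

# Hypothetical two-sentence CoNLL-U sample (illustration only, not real UD data).
# Columns: ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC
ROWS = [
    ["1", "The", "the", "DET", "_", "Definite=Def", "2", "det", "_", "_"],
    ["2", "saw", "saw", "NOUN", "_", "Number=Sing", "3", "nsubj", "_", "_"],
    ["3", "broke", "break", "VERB", "_", "Tense=Past", "0", "root", "_", "_"],
    [],  # blank line separates sentences
    ["1", "She", "she", "PRON", "_", "_", "2", "nsubj", "_", "_"],
    ["2", "saw", "see", "VERB", "_", "Tense=Past", "0", "root", "_", "_"],
    ["3", "it", "it", "PRON", "_", "_", "2", "obj", "_", "_"],
]
SAMPLE = "\n".join("\t".join(row) for row in ROWS)


def build_lexicon(conllu_text):
    """Map each lowercased surface form to the set of analyses seen in training."""
    lexicon = defaultdict(set)
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip sentence breaks and comment lines
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed token line
        token_id, form, lemma, upos, feats = cols[0], cols[1], cols[2], cols[3], cols[5]
        if "-" in token_id or "." in token_id:
            continue  # skip multiword-token ranges and empty nodes
        lexicon[form.lower()].add((lemma, upos, feats))
    return lexicon


def analyze(lexicon, token):
    """Return all candidate analyses for a token; empty list if unseen."""
    return sorted(lexicon.get(token.lower(), set()))


LEXICON = build_lexicon(SAMPLE)
for analysis in analyze(LEXICON, "saw"):
    print(analysis)
```

In this sketch an unseen token simply receives no analyses; a practical analyzer would need some fallback for out-of-vocabulary forms, which is left out here.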
