Workshop on Natural Language Processing for Indigenous Languages of the Americas
Expanding Universal Dependencies for Polysynthetic Languages: A Case of St. Lawrence Island Yupik

Abstract

This paper describes the development of the first Universal Dependencies (UD, Nivre et al., 2016, 2020) treebank for St. Lawrence Island Yupik, an endangered language spoken in the Bering Strait region. While the UD guidelines provided a general framework for our annotations, language-specific decisions were made necessary by the rich morphology of the polysynthetic language. Most notably, we annotated a corpus at the morpheme level as well as the word level. The morpheme level annotation was conducted using an existing morphological analyzer (Chen et al., 2020) and manual disambiguation. By comparing the two resulting annotation schemes, we argue that morpheme-level annotation is essential for polysynthetic languages like St. Lawrence Island Yupik. Word-level annotation results in degenerate trees for some Yupik sentences and often fails to capture syntactic relations that can be manifested at the morpheme level. Dependency parsing experiments provide further support for morpheme-level annotation. Implications for UD annotation of other polysynthetic languages are discussed.
