Conference: Annual Meeting of the Association for Computational Linguistics

Learning Morphosyntactic Analyzers from the Bible via Iterative Annotation Projection across 26 Languages

Abstract

A large percentage of computational tools are concentrated in a very small subset of the planet's languages. Compounding the issue, many languages lack the high-quality linguistic annotation necessary for the construction of such tools with current machine learning methods. In this paper, we address both issues simultaneously: leveraging the high accuracy of English taggers and parsers, we project morphological information onto translations of the Bible in 26 varied test languages. Using an iterative discovery, constraint, and training process, we build inflectional lexica in the target languages. Through a combination of iteration, ensembling, and reranking, we see double-digit relative error reductions in lemmatization and morphological analysis over a strong initial system.
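
As a rough illustration of the projection idea described in the abstract, the sketch below copies tags from English tokens to aligned target-language tokens and then keeps the most frequent tag per word form. It is a minimal sketch under stated assumptions, not the paper's actual pipeline: the function names `project_tags` and `build_lexicon`, the tag strings, the toy Spanish data, and the majority-vote aggregation are all hypothetical choices made for illustration, and the word alignments are assumed to come from an external aligner.

```python
from collections import Counter, defaultdict


def project_tags(source_tags, target_tokens, alignment):
    """Copy each source token's tag onto the target tokens aligned with it.

    source_tags: one tag per source (English) token.
    target_tokens: tokens of the target-language verse.
    alignment: iterable of (source_index, target_index) pairs,
        assumed to be produced by some external word aligner.
    Returns (target_token, tag) pairs for aligned target tokens only.
    """
    return [(target_tokens[tgt_i], source_tags[src_i]) for src_i, tgt_i in alignment]


def build_lexicon(projections):
    """Keep the majority tag for each target word form (hypothetical aggregation)."""
    votes = defaultdict(Counter)
    for token, tag in projections:
        votes[token][tag] += 1
    return {token: counts.most_common(1)[0][0] for token, counts in votes.items()}


# Toy data: three verses in which the same target form appears.
# Two alignments are correct, one is a noisy link to the pronoun.
verses = [
    (["PRON", "V;PST"], ["caminó"], [(1, 0)]),
    (["PRON", "V;PST"], ["caminó"], [(1, 0)]),
    (["PRON", "V;PST"], ["caminó"], [(0, 0)]),  # noisy alignment
]

projections = []
for source_tags, target_tokens, alignment in verses:
    projections.extend(project_tags(source_tags, target_tokens, alignment))

lexicon = build_lexicon(projections)
```

Aggregating over many verses is what lets a simple vote absorb occasional alignment errors; the iterative discovery, constraint, and training steps mentioned in the abstract go beyond this sketch.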
