Dependency Grammar Induction via Bitext Projection Constraints

Abstract

Broad-coverage annotated treebanks necessary to train parsers do not exist for many resource-poor languages. The wide availability of parallel text and accurate parsers in English has opened up the possibility of grammar induction through partial transfer across bitext. We consider generative and discriminative models for dependency grammar induction that use word-level alignments and a source language parser (English) to constrain the space of possible target trees. Unlike previous approaches, our framework does not require full projected parses, allowing partial, approximate transfer through linear expectation constraints on the space of distributions over trees. We consider several types of constraints that range from generic dependency conservation to language-specific annotation rules for auxiliary verb analysis. We evaluate our approach on Bulgarian and Spanish CoNLL shared task data and show that we consistently outperform unsupervised methods and can outperform supervised learning for limited training data.
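
To make the idea of a "dependency conservation" constraint concrete, here is a minimal sketch, not the authors' implementation. It assumes a one-to-one word alignment stored as a dictionary, trees stored as sets of (head, child) index pairs, and a toy distribution given as an explicit list of (probability, tree) pairs. The function names and the threshold `eta` are hypothetical and chosen only for illustration. The sketch computes the expected fraction of projected source edges that a distribution over target trees preserves, which is the kind of quantity a linear expectation constraint could bound from below.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[int, int]  # (head index, child index)


def projected_edges(source_edges: Set[Edge], alignment: Dict[int, int]) -> Set[Edge]:
    """Map source edges to the target side when both endpoints are aligned."""
    projected = set()
    for head, child in source_edges:
        if head in alignment and child in alignment:
            t_head, t_child = alignment[head], alignment[child]
            if t_head != t_child:  # skip edges that collapse onto one target word
                projected.add((t_head, t_child))
    return projected


def conserved_fraction(target_edges: Set[Edge],
                       source_edges: Set[Edge],
                       alignment: Dict[int, int]) -> float:
    """Fraction of projected source edges that also appear in one target tree."""
    candidates = projected_edges(source_edges, alignment)
    if not candidates:
        return 1.0  # nothing to conserve, so the constraint holds trivially
    return len(candidates & target_edges) / len(candidates)


def expected_conservation(distribution: List[Tuple[float, Set[Edge]]],
                          source_edges: Set[Edge],
                          alignment: Dict[int, int]) -> float:
    """Expectation of the conserved fraction under a distribution over trees."""
    return sum(prob * conserved_fraction(tree, source_edges, alignment)
               for prob, tree in distribution)


# Toy example: a three-word source sentence with edges 1 -> 0 and 2 -> 1,
# aligned word-for-word to a three-word target sentence.
source_edges = {(1, 0), (2, 1)}
alignment = {0: 0, 1: 1, 2: 2}

tree_a = {(1, 0), (2, 1)}  # conserves both projected edges
tree_b = {(2, 0), (2, 1)}  # conserves one of the two projected edges
distribution = [(0.5, tree_a), (0.5, tree_b)]

eta = 0.7  # hypothetical lower bound on the expected conserved fraction
expectation = expected_conservation(distribution, source_edges, alignment)
constraint_satisfied = expectation >= eta
```

Because the constraint is placed on an expectation rather than on each tree, individual target trees may deviate from the projected structure as long as the distribution as a whole conserves enough edges. That is how partial, approximate transfer can work without full projected parses.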
