Multi-Task Word Alignment Triangulation for Low-Resource Languages

Abstract

We present a multi-task learning approach that jointly trains three word alignment models over disjoint bitexts of three languages: source, target and pivot. Our approach builds upon model triangulation as proposed by Wang et al., which approximates a source-target model by combining source-pivot and pivot-target models. We develop a MAP-EM algorithm that uses triangulation as a prior, and show how to extend it to a multi-task setting. On a low-resource Czech-English corpus, using French as the pivot, our multi-task learning approach more than doubles the gains in both F- and Bleu scores compared to the interpolation approach of Wang et al. Further experiments reveal that the choice of pivot language does not significantly affect performance.
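To illustrate the triangulation step mentioned in the abstract, the sketch below combines a source-pivot and a pivot-target lexical translation table by summing over pivot words, i.e. t(s | t) ≈ Σ_p t(s | p) · t(p | t). The abstract does not give the formula, so this marginalization is an assumed, commonly used form. The function name `triangulate`, the dictionary layout, and the toy probabilities are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def triangulate(t_src_given_piv, t_piv_given_tgt):
    """Approximate t(src | tgt) as sum over pivot words p of t(src | p) * t(p | tgt).

    Assumed layout (hypothetical, for illustration only):
      t_src_given_piv[p][s] = t(s | p)
      t_piv_given_tgt[t][p] = t(p | t)
    """
    t_src_given_tgt = {}
    for tgt, piv_dist in t_piv_given_tgt.items():
        dist = defaultdict(float)
        for piv, p_piv in piv_dist.items():
            # Pivot words unseen in the source-pivot table contribute nothing.
            for src, p_src in t_src_given_piv.get(piv, {}).items():
                dist[src] += p_src * p_piv
        t_src_given_tgt[tgt] = dict(dist)
    return t_src_given_tgt


# Toy, made-up probabilities: Czech source, French pivot, English target.
t_cs_given_fr = {
    "chien": {"pes": 0.9, "psa": 0.1},
    "chat": {"kocour": 1.0},
}
t_fr_given_en = {
    "dog": {"chien": 1.0},
    "cat": {"chat": 0.8, "chien": 0.2},
}

t_cs_given_en = triangulate(t_cs_given_fr, t_fr_given_en)
for en_word, dist in t_cs_given_en.items():
    print(en_word, dist)
```

In the approach the abstract describes, a triangulated table of this kind would serve as a prior inside MAP-EM rather than as the final model. That step is not sketched here because the abstract does not specify the form of the prior.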