Published in: Annual Meeting of the Association for Computational Linguistics

Language-independent Compound Splitting with Morphological Operations

Abstract

Translating compounds is an important problem in machine translation. Since many compounds have not been observed during training, they pose a challenge for translation systems. Previous decompounding methods have often been restricted to a small set of languages as they cannot deal with more complex compound forming processes. We present a novel and unsupervised method to learn the compound parts and morphological operations needed to split compounds into their compound parts. The method uses a bilingual corpus to learn the morphological operations required to split a compound into its parts. Furthermore, monolingual corpora are used to learn and filter the set of compound part candidates. We evaluate our method within a machine translation task and show significant improvements for various languages to show the versatility of the approach.
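
The idea of splitting a compound by undoing morphological operations can be illustrated with a small sketch. This is not the paper's implementation: the part vocabulary, the list of operations, the German examples, and the "fewest parts wins" rule are all hypothetical stand-ins for what the described method would learn from bilingual and monolingual corpora.

```python
# Toy illustration only. In the described method, both the compound parts and
# the morphological operations are learned from corpora; here they are
# hard-coded, hypothetical examples for German.
PART_VOCAB = {"verkehr", "zeichen", "arbeit", "amt", "blume", "topf", "schule", "buch"}

# Each operation is (suffix_to_strip, suffix_to_append), applied to a non-final
# piece of the compound to recover a known part. ("", "") is the identity.
OPERATIONS = [("", ""), ("s", ""), ("n", ""), ("", "e")]


def normalize(piece):
    """Yield every known part reachable from `piece` by one operation."""
    for strip, append in OPERATIONS:
        if piece.endswith(strip):
            candidate = piece[: len(piece) - len(strip)] + append
            if candidate in PART_VOCAB:
                yield candidate


def split_compound(word, min_len=3):
    """Return a tuple of known parts covering `word`, or None if no split exists.

    Assumption of this sketch: the final part appears unmodified, and among
    all valid splits the one with the fewest parts is preferred.
    """
    if word in PART_VOCAB:
        return (word,)
    best = None
    for i in range(min_len, len(word) - min_len + 1):
        head, rest = word[:i], word[i:]
        tail = split_compound(rest, min_len)
        if tail is None:
            continue
        for part in normalize(head):
            candidate = (part,) + tail
            if best is None or len(candidate) < len(best):
                best = candidate
    return best


if __name__ == "__main__":
    for w in ["verkehrszeichen", "blumentopf", "schulbuch", "haus"]:
        print(w, "->", split_compound(w))
```

In this sketch, "verkehrszeichen" is split by stripping the linking "s" from "verkehrs", and "schulbuch" by restoring the dropped "e" of "schule". A realistic system would also need a scoring step to filter unlikely part candidates, which the abstract attributes to monolingual corpora.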
