Annual Meeting of the Association for Computational Linguistics (ACL 2011)

Language-independent Compound Splitting with Morphological Operations

Abstract

Translating compounds is an important problem in machine translation. Because many compounds are never observed during training, they pose a challenge for translation systems. Previous decompounding methods have often been restricted to a small set of languages because they cannot handle more complex compound-forming processes. We present a novel unsupervised method that learns both the compound parts and the morphological operations needed to split compounds into those parts. The method uses a bilingual corpus to learn the morphological operations, and monolingual corpora to learn and filter the set of compound part candidates. We evaluate the method within a machine translation task and report significant improvements for various languages, which demonstrates the versatility of the approach.
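The abstract does not spell out the algorithm, so the following is only a toy illustration of what splitting a compound with morphological operations can look like. It assumes a hard-coded set of known compound parts and a small set of linking elements to delete at part boundaries. In the described method both would be learned from corpora. The names `PARTS`, `LINKERS`, and `split_compound` are hypothetical, and the "fewest parts wins" rule is a simplification chosen for this sketch, not something taken from the paper.

```python
from functools import lru_cache

# Toy stand-ins (assumptions): the method described above would learn the
# compound parts from monolingual corpora and the operations from a
# bilingual corpus. Here both are hard-coded for illustration.
PARTS = frozenset({"verkehr", "zeichen", "arbeit", "amt"})
# Each entry is a linking element deleted from the end of a non-final part;
# the empty string means "no change".
LINKERS = ("", "s", "es", "n")


def split_compound(word, parts=PARTS, linkers=LINKERS):
    """Split `word` into known parts, or return [word] if no split is found."""
    word = word.lower()

    @lru_cache(maxsize=None)
    def best(start):
        # Shortest tuple of known parts covering word[start:], or None.
        rest = word[start:]
        if rest in parts:
            return (rest,)
        result = None
        for end in range(start + 1, len(word)):
            chunk = word[start:end]
            for linker in linkers:
                if linker and not chunk.endswith(linker):
                    continue
                # Undo the morphological operation: drop the linking element.
                stem = chunk[: len(chunk) - len(linker)]
                if stem not in parts:
                    continue
                tail = best(end)
                if tail is None:
                    continue
                candidate = (stem,) + tail
                if result is None or len(candidate) < len(result):
                    result = candidate
        return result

    found = best(0)
    return list(found) if found else [word]
```

For example, `split_compound("Verkehrszeichen")` removes the linking "s" and yields the parts "verkehr" and "zeichen", while a word with no known parts is returned unchanged.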
