Journal: Machine Translation

Automatic induction of bilingual resources from aligned parallel corpora: application to shallow-transfer machine translation


Abstract

The availability of machine-readable bilingual linguistic resources is crucial not only for rule-based machine translation but also for other applications such as cross-lingual information retrieval. However, building such resources (bilingual single-word and multi-word correspondences, translation rules) demands extensive manual work. As a consequence, bilingual resources are usually more difficult to find than "shallow" monolingual resources such as morphological dictionaries or part-of-speech taggers, especially when they involve a less-resourced language. This paper describes a methodology to automatically build both bilingual dictionaries and shallow-transfer rules by extracting knowledge from word-aligned parallel corpora processed with shallow monolingual resources (morphological analysers and part-of-speech taggers). We present experiments for Brazilian Portuguese-Spanish and Brazilian Portuguese-English parallel texts. The results show that the proposed methodology can enable the rapid creation of valuable computational resources (bilingual dictionaries and shallow-transfer rules) for machine translation and other natural language processing tasks.
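The abstract does not give the induction procedure itself, so the following is only a minimal illustrative sketch of the general idea of reading dictionary entries off a word-aligned, morphologically analysed corpus. It is not the paper's algorithm. The function name `induce_dictionary`, the `min_count` threshold, the `(lemma, pos)` token format, the tag names, and the toy Portuguese-Spanish data are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def induce_dictionary(corpus, min_count=2):
    """Build a toy bilingual dictionary from word-aligned sentence pairs.

    corpus: iterable of (src_tokens, tgt_tokens, alignment) triples, where
    tokens are (lemma, pos) tuples and alignment is a list of (i, j) index
    pairs linking src_tokens[i] to tgt_tokens[j].
    min_count: hypothetical frequency threshold for keeping an entry.
    """
    # Count how often each source entry is aligned to each target entry.
    counts = defaultdict(Counter)
    for src, tgt, alignment in corpus:
        for i, j in alignment:
            counts[src[i]][tgt[j]] += 1

    # Keep the most frequent translation of each source entry,
    # provided it was observed at least min_count times.
    dictionary = {}
    for src_entry, tgt_counts in counts.items():
        tgt_entry, n = tgt_counts.most_common(1)[0]
        if n >= min_count:
            dictionary[src_entry] = tgt_entry
    return dictionary


# Toy, hand-made data (not taken from the paper's corpora).
corpus = [
    ([("o", "det"), ("gato", "n"), ("dormir", "vblex")],
     [("el", "det"), ("gato", "n"), ("dormir", "vblex")],
     [(0, 0), (1, 1), (2, 2)]),
    ([("o", "det"), ("cachorro", "n")],
     [("el", "det"), ("perro", "n")],
     [(0, 0), (1, 1)]),
    ([("o", "det"), ("cachorro", "n"), ("dormir", "vblex")],
     [("el", "det"), ("perro", "n"), ("dormir", "vblex")],
     [(0, 0), (1, 1), (2, 2)]),
]

bidix = induce_dictionary(corpus)
```

In this sketch, entries seen only once (such as the single `gato` alignment) are dropped by the threshold. A real system would also need to handle multi-word units, one-to-many alignments, and the induction of transfer rules, none of which is covered here.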