Constructing a translation lexicon from comparable, non-parallel corpora

Abstract

A machine translation system may use non-parallel monolingual corpora to generate a translation lexicon. The system may identify identically spelled words in the two corpora, and use them as a seed lexicon. The system may use various clues, e.g., context and frequency, to identify and score other possible translation pairs, using the seed lexicon as a basis. An alternative system may use a small bilingual lexicon in addition to non-parallel corpora to learn translations of unknown words and to generate a parallel corpus.
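The seed-and-score idea in the abstract can be illustrated with a minimal sketch. It is not the patented method: the function names, the minimum word length for seeds, the context window size, cosine similarity as the context measure, and the equal-weight average of the context and frequency clues are all assumptions made for illustration. The two toy corpora are invented as well.

```python
from collections import Counter
from math import sqrt


def seed_lexicon(src_sents, tgt_sents, min_len=4):
    """Words spelled identically in both corpora serve as seed pairs.

    min_len is an assumed filter against short accidental matches.
    """
    src_vocab = {w for s in src_sents for w in s}
    tgt_vocab = {w for s in tgt_sents for w in s}
    return {w for w in src_vocab & tgt_vocab if len(w) >= min_len}


def context_vector(sents, word, seeds, window=3):
    """Count seed words occurring within `window` tokens of `word`."""
    vec = Counter()
    for s in sents:
        for i, w in enumerate(s):
            if w != word:
                continue
            for j in range(max(0, i - window), min(len(s), i + window + 1)):
                if j != i and s[j] in seeds:
                    vec[s[j]] += 1
    return vec


def cosine(u, v):
    """Cosine similarity of two sparse count vectors (0.0 if either is empty)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def frequency_similarity(src_counts, tgt_counts, s, t):
    """Ratio of relative frequencies, in (0, 1]; 1 means equally frequent."""
    fs = src_counts[s] / sum(src_counts.values())
    ft = tgt_counts[t] / sum(tgt_counts.values())
    return min(fs, ft) / max(fs, ft)


def score_pairs(src_sents, tgt_sents, window=3):
    """Score every non-seed (source, target) word pair, best first."""
    seeds = seed_lexicon(src_sents, tgt_sents)
    src_counts = Counter(w for s in src_sents for w in s)
    tgt_counts = Counter(w for s in tgt_sents for w in s)
    tgt_vecs = {t: context_vector(tgt_sents, t, seeds, window)
                for t in set(tgt_counts) - seeds}
    scored = []
    for s in set(src_counts) - seeds:
        sv = context_vector(src_sents, s, seeds, window)
        for t, tv in tgt_vecs.items():
            # Assumed combination: plain average of the two clues.
            score = 0.5 * cosine(sv, tv) + 0.5 * frequency_similarity(
                src_counts, tgt_counts, s, t)
            scored.append((score, s, t))
    return sorted(scored, reverse=True)


# Invented toy corpora (German / English), tokenized and lowercased.
german = [["das", "hotel", "hat", "ein", "restaurant"],
          ["der", "gast", "im", "hotel", "ruft", "ein", "taxi"]]
english = [["the", "hotel", "has", "a", "restaurant"],
           ["the", "guest", "in", "the", "hotel", "calls", "a", "taxi"]]

seeds = seed_lexicon(german, english)
ranked = score_pairs(german, english)
```

On corpora this small the ranking is mostly noise. The sketch only shows how identically spelled words anchor a shared context space in which words from the two languages can be compared.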
