Constructing a translation lexicon from comparable, non-parallel corpora


Abstract

A machine translation system may use non-parallel monolingual corpora to generate a translation lexicon. The system may identify identically spelled words in the two corpora, and use them as a seed lexicon. The system may use various clues, e.g., context and frequency, to identify and score other possible translation pairs, using the seed lexicon as a basis. An alternative system may use a small bilingual lexicon in addition to non-parallel corpora to learn translations of unknown words and to generate a parallel corpus.
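The seed-lexicon idea described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the patent text: the function names (`seed_lexicon`, `context_vector`, `score_pair`, `frequency_similarity`), the minimum word length, the window size, the use of cosine similarity for the context clue, the frequency-ratio formula, and the toy German and English token lists.

```python
from collections import Counter
import math


def seed_lexicon(src_tokens, tgt_tokens, min_len=4):
    """Pair up words spelled identically in both corpora (assumed length filter)."""
    shared = set(src_tokens) & set(tgt_tokens)
    return {word: word for word in shared if len(word) >= min_len}


def context_vector(tokens, word, seed_words, window=3):
    """Count seed words that occur within `window` tokens of each `word` occurrence."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != word:
            continue
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i and tokens[j] in seed_words:
                counts[tokens[j]] += 1
    return counts


def cosine(a, b):
    """Cosine similarity of two sparse count vectors; 0.0 if either is empty."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def score_pair(src_tokens, tgt_tokens, src_word, tgt_word, seed, window=3):
    """Context clue: compare the seed-word neighbourhoods of a candidate pair."""
    src_vec = context_vector(src_tokens, src_word, set(seed), window)
    tgt_vec = context_vector(tgt_tokens, tgt_word, set(seed.values()), window)
    # Map source-side seed words to their target side so both vectors share keys.
    mapped = Counter({seed[k]: v for k, v in src_vec.items()})
    return cosine(mapped, tgt_vec)


def frequency_similarity(src_tokens, tgt_tokens, src_word, tgt_word):
    """Frequency clue: ratio of the smaller to the larger relative frequency."""
    f_src = src_tokens.count(src_word) / len(src_tokens)
    f_tgt = tgt_tokens.count(tgt_word) / len(tgt_tokens)
    if f_src == 0 or f_tgt == 0:
        return 0.0
    return min(f_src, f_tgt) / max(f_src, f_tgt)


# Toy, hand-made corpora (hypothetical data, lowercased and pre-tokenized).
src = "der hund holt ball im park die katze mag sofa und radio".split()
tgt = "the dog fetches ball in park the cat likes sofa and radio".split()

seed = seed_lexicon(src, tgt)
for candidate in ("dog", "cat"):
    print("hund ->", candidate, score_pair(src, tgt, "hund", candidate, seed))
```

In this toy setting the identically spelled words serve as shared dimensions for the two context vectors, so a candidate pair scores higher when both words tend to appear near the same seed words. A fuller system would presumably combine several such clue scores; how they are weighted is not specified in the abstract.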


Bibliographic details

Publication number: AU2003269808A8

Publication date: 2004-01-06

Original format: PDF
Applicant/patentee: UNIVERSITY OF SOUTHERN CALIFORNIA

Application number: AU20030269808
Inventors: PHILIPP KOEHN; DRAGOS STEFAN MUNTEANU; DANIEL MARCU; KEVIN KNIGHT

Filing date: 2003-03-26
Classification: G06F17/27; G06F17/28
Country: AU
Date indexed: 2022-08-21 23:02:30
