SYSTEMS AND METHODS FOR IDENTIFYING PARALLEL DOCUMENTS AND SENTENCE FRAGMENTS IN MULTILINGUAL DOCUMENT COLLECTIONS

Abstract

Systems, computer programs, and methods for identifying parallel documents and/or fragments in a bilingual collection are provided. The method for identifying parallel sub-sentential fragments in a bilingual collection comprises translating a source document from a bilingual collection. The method further includes querying a target library associated with the bilingual collection using the translated source document, and identifying one or more target documents based on the query. Subsequently, a source sentence associated with the source document is aligned to one or more target sentences associated with the one or more target documents. Finally, the method includes determining whether a source fragment associated with the source sentence comprises a parallel translation of a target fragment associated with the one or more target sentences.
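
The steps in the abstract (translate, query, align, test fragments) can be illustrated with a toy sketch. Everything here is an assumption made for illustration, not the patented implementation: the tiny `LEXICON`, the function names, and the Jaccard word-overlap score, which stands in for whatever translation, retrieval, and alignment models a real system would use.

```python
import re

# Hypothetical toy lexicon (source word -> target word); not from the patent.
LEXICON = {"le": "the", "chat": "cat", "noir": "black", "dort": "sleeps",
           "chien": "dog", "mange": "eats"}


def tokenize(text):
    """Lowercase a string and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def split_sentences(doc):
    """Naive sentence splitter on '.', '!' and '?'."""
    return [s.strip() for s in re.split(r"[.!?]+", doc) if s.strip()]


def translate(tokens):
    """Word-by-word gloss; words missing from the lexicon are dropped."""
    return [LEXICON[t] for t in tokens if t in LEXICON]


def overlap(a, b):
    """Jaccard overlap between two token collections (assumed score)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a and b else 0.0


def retrieve(source_doc, target_library, top_k=1):
    """Query the target library with the translated source document."""
    query = translate(tokenize(source_doc))
    ranked = sorted(target_library,
                    key=lambda d: overlap(query, tokenize(d)),
                    reverse=True)
    return ranked[:top_k]


def align(source_doc, target_docs, min_score=0.2):
    """Pair each source sentence with sufficiently similar target sentences."""
    targets = [t for d in target_docs for t in split_sentences(d)]
    pairs = []
    for s in split_sentences(source_doc):
        gloss = translate(tokenize(s))
        for t in targets:
            if overlap(gloss, tokenize(t)) >= min_score:
                pairs.append((s, t))
    return pairs


def parallel_fragments(source_sentence, target_sentence, min_len=2):
    """Return contiguous source word runs whose glosses occur in the target."""
    target_words = set(tokenize(target_sentence))
    fragments, run = [], []
    for tok in tokenize(source_sentence):
        if LEXICON.get(tok) in target_words:
            run.append(tok)
        else:
            if len(run) >= min_len:
                fragments.append(" ".join(run))
            run = []
    if len(run) >= min_len:
        fragments.append(" ".join(run))
    return fragments
```

In this sketch a caller would chain the functions: `retrieve` selects candidate target documents, `align` proposes sentence pairs, and `parallel_fragments` is applied to each pair to decide which sub-sentential spans look like mutual translations.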

Bibliographic data

Publication number: WO2007126447A3

Patent type:
Publication date: 2008-04-17

Original format: PDF
Applicant/patentee: UNIVERSITY OF SOUTHERN CALIFORNIA; MARCU DANIEL; MUNTEANU DRAGOS STEFAN

Application number: WO2007US00853
Inventors: MARCU DANIEL; MUNTEANU DRAGOS STEFAN

Filing date: 2007-01-12
Classification: G06F7
Country: WO
Database entry time: 2022-08-21 20:02:00
