SYSTEMS AND METHODS FOR IDENTIFYING PARALLEL DOCUMENTS AND SENTENCE FRAGMENTS IN MULTILINGUAL DOCUMENT COLLECTIONS

Abstract

Systems, computer programs, and methods for identifying parallel documents and/or fragments in a bilingual collection are provided. The method for identifying parallel sub-sentential fragments in a bilingual collection comprises translating a source document from a bilingual collection. The method further includes querying a target library associated with the bilingual collection using the translated source document, and identifying one or more target documents based on the query. Subsequently, a source sentence associated with the source document is aligned to one or more target sentences associated with the one or more target documents. Finally, the method includes determining whether a source fragment associated with the source sentence comprises a parallel translation of a target fragment associated with the one or more target sentences.
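
The four steps in the abstract (translate, query, align, test fragments) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the toy `LEXICON`, the word-by-word `translate` gloss, the word-overlap score used for both retrieval and alignment, and the `min_len` threshold in `parallel_fragments`. A real system would likely use a statistical translation model and a proper retrieval engine in their place.

```python
from typing import Dict, List

# Hypothetical toy lexicon (source word -> target word); illustration only.
LEXICON: Dict[str, str] = {
    "el": "the", "gato": "cat", "negro": "black", "duerme": "sleeps",
    "en": "in", "la": "the", "casa": "house",
}

def translate(tokens: List[str]) -> List[str]:
    """Step 1: word-by-word gloss; words missing from the lexicon are dropped."""
    return [LEXICON[t] for t in tokens if t in LEXICON]

def overlap(query: List[str], candidate: List[str]) -> float:
    """Fraction of distinct query words that also occur in the candidate."""
    q = set(query)
    return len(q & set(candidate)) / len(q) if q else 0.0

def retrieve(query: List[str], library: Dict[str, List[List[str]]],
             top_k: int = 1) -> List[str]:
    """Step 2: rank target documents by overlap with the translated source."""
    def score(name: str) -> float:
        words = [w for sentence in library[name] for w in sentence]
        return overlap(query, words)
    return sorted(library, key=score, reverse=True)[:top_k]

def align(source_sentence: List[str],
          target_sentences: List[List[str]]) -> List[str]:
    """Step 3: pick the target sentence that best matches the source gloss."""
    gloss = translate(source_sentence)
    return max(target_sentences, key=lambda s: overlap(gloss, s))

def parallel_fragments(source_sentence: List[str], target_sentence: List[str],
                       min_len: int = 2) -> List[List[str]]:
    """Step 4: contiguous runs of source words whose gloss occurs in the target."""
    target = set(target_sentence)
    fragments: List[List[str]] = []
    run: List[str] = []
    for tok in source_sentence:
        if LEXICON.get(tok) in target:
            run.append(tok)
        else:
            if len(run) >= min_len:
                fragments.append(run)
            run = []
    if len(run) >= min_len:
        fragments.append(run)
    return fragments

if __name__ == "__main__":
    source = ["el", "gato", "negro", "duerme", "en", "la", "casa"]
    library = {
        "doc_a": [["the", "black", "cat", "sleeps", "all", "day"]],
        "doc_b": [["stock", "markets", "fell", "in", "the", "morning"]],
    }
    best_doc = retrieve(translate(source), library)[0]
    best_sentence = align(source, library[best_doc])
    print(best_doc, parallel_fragments(source, best_sentence))
```

In this toy run the two sentences are not full translations of each other, yet the first four source words form a fragment with a counterpart in the target sentence. That is the sub-sentential case the abstract describes, where sentence-level matching alone would miss the shared material.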
