PHRASE-BASED JOINT PROBABILITY MODEL FOR STATISTICAL MACHINE TRANSLATION

Abstract

A computer-implemented method for generating a phrase-based joint probability model from a parallel corpus comprising a plurality of source-language sentences and a corresponding plurality of target-language sentences, the method comprising:
a) defining, from the parallel corpus, high-frequency n-grams $\vec{e}_i$ in E and $\vec{f}_i$ in F, where E and F comprise phrases in the source and target language, respectively;
b) obtaining an initial sentence-based joint probability distribution t by:
   i) taking, for each sentence pair (E, F) in the corpus, the Cartesian product of the n-grams $\vec{e}_i$ in E and $\vec{f}_i$ in F;
   ii) determining, for each n-gram pair ($\vec{e}_i$, $\vec{f}_i$) in the Cartesian product, a t-count given by the expression (see formula), where l and m are the lengths of the sentences E and F, respectively, a and b are the lengths of the n-grams $\vec{e}_i$ and $\vec{f}_i$, and S is the Stirling number of the second kind;
   iii) summing the t-counts and normalizing; and
c) performing Expectation-Maximization (EM) training for a plurality of iterations to generate a joint probability distribution t.
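Steps a) and b) can be illustrated with a minimal Python sketch. The t-count formula is not reproduced in this record, so the sketch assumes a form consistent with the symbols the abstract defines: the number of alignments of the words left over once an n-gram pair is removed, divided by the number of alignments of the full sentence pair, where an alignment count for lengths l and m is taken to be the sum over k of k! · S(l, k) · S(m, k). The function names, the `min_count` frequency threshold, the `max_n` n-gram limit, and the treatment of an empty remainder as a single alignment are all assumptions of this sketch, not details from the patent. EM training (step c) is not shown.

```python
from collections import Counter, defaultdict
from functools import lru_cache
from itertools import product
from math import factorial


@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind: partitions of n items into k non-empty sets."""
    if n == k:
        return 1  # includes S(0, 0) = 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def alignment_count(l, m):
    """ASSUMED form: sum over k of k! * S(l, k) * S(m, k) for k = 1..min(l, m)."""
    if l == 0 and m == 0:
        return 1  # sketch choice: nothing left to align counts as one way
    return sum(factorial(k) * stirling2(l, k) * stirling2(m, k)
               for k in range(1, min(l, m) + 1))


def ngrams(words, max_n):
    """All contiguous n-grams of length 1..max_n, as tuples."""
    return [tuple(words[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(words) - n + 1)]


def frequent_ngrams(sentences, max_n, min_count):
    """Step a): keep n-grams seen at least min_count times (threshold is hypothetical)."""
    freq = Counter(g for s in sentences for g in ngrams(s, max_n))
    return {g for g, c in freq.items() if c >= min_count}


def initial_t(corpus, max_n=2, min_count=1):
    """Step b): sum assumed t-counts over the corpus, then normalize."""
    keep_e = frequent_ngrams([E for E, _ in corpus], max_n, min_count)
    keep_f = frequent_ngrams([F for _, F in corpus], max_n, min_count)
    counts = defaultdict(float)
    for E, F in corpus:
        l, m = len(E), len(F)
        total = alignment_count(l, m)
        es = [g for g in ngrams(E, max_n) if g in keep_e]
        fs = [g for g in ngrams(F, max_n) if g in keep_f]
        # b.i) Cartesian product of the retained n-grams of this sentence pair
        for e, f in product(es, fs):
            # b.ii) assumed t-count: alignments of the remainder / all alignments
            counts[(e, f)] += alignment_count(l - len(e), m - len(f)) / total
    # b.iii) normalize the summed t-counts into a joint distribution
    z = sum(counts.values())
    if z == 0:
        return {}
    return {pair: c / z for pair, c in counts.items()}


t = initial_t([(["a", "b"], ["x", "y"])])
```

Under these assumptions, a corpus consisting of one pair of two-word sentences gives probability 0.2 to each of the four word-to-word pairs and to the pair of whole sentences, and 0 to pairs that would leave words on only one side unaligned.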