
Discriminative training of models for sequence classification


Abstract

Classification of sequences, such as the translation of natural language sentences, is carried out under an independence assumption: the probability that a source sentence word is correctly translated into a particular target sentence word is taken to be independent of how the other words in the sentence are translated. Although this assumption is not correct, a high level of word translation accuracy is nonetheless achieved. In particular, discriminative training is used to develop a model for each target vocabulary word from a set of features of the corresponding source word in training sentences, at least one of which relates to the context of the source word. Each model comprises a weight vector for the corresponding target vocabulary word. Each weight in a vector is associated with one of the features and measures the extent to which the presence of that feature for the source word makes it more probable that the target word in question is the correct one.
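The scheme described in the abstract can be illustrated with a minimal sketch. It is not the patented method: the feature set (`word=`, `prev=`, `next=`), the word-aligned toy sentence pairs, and the use of a multiclass perceptron as the discriminative trainer are all assumptions chosen for illustration. What it does show is one weight vector per target vocabulary word, features that include the source word's context, and each position being classified independently of the others.

```python
from collections import defaultdict

def features(words, i):
    """Binary features of the source word at position i, including its context.
    The feature names are hypothetical, chosen only for this sketch."""
    prev_w = words[i - 1] if i > 0 else "<s>"
    next_w = words[i + 1] if i + 1 < len(words) else "</s>"
    return [f"word={words[i]}", f"prev={prev_w}", f"next={next_w}"]

def score(weights, target, feats):
    """Sum the target word's weights over the features that are present."""
    w = weights[target]
    return sum(w[f] for f in feats)

def predict(weights, vocab, words):
    """Independence assumption: choose the best target word for each
    source position without looking at the other chosen target words."""
    return [max(vocab, key=lambda t: score(weights, t, features(words, i)))
            for i in range(len(words))]

def train(pairs, epochs=5):
    """Multiclass perceptron, used here as a stand-in discriminative trainer.
    Assumes each target sentence is word-aligned one-to-one with its source."""
    vocab = sorted({t for _, tgt in pairs for t in tgt})
    weights = defaultdict(lambda: defaultdict(float))  # target word -> feature -> weight
    for _ in range(epochs):
        for src, tgt in pairs:
            for i, gold in enumerate(tgt):
                feats = features(src, i)
                guess = max(vocab, key=lambda t: score(weights, t, feats))
                if guess != gold:
                    # Reward the correct target word, penalize the wrong guess.
                    for f in feats:
                        weights[gold][f] += 1.0
                        weights[guess][f] -= 1.0
    return weights, vocab

# Toy data: "banco" translates to "bank" or "bench" depending on context.
pairs = [
    (["el", "banco", "central"], ["the", "bank", "central"]),
    (["el", "banco", "verde"], ["the", "bench", "green"]),
]
weights, vocab = train(pairs)
print(predict(weights, vocab, ["el", "banco", "central"]))  # ['the', 'bank', 'central']
print(predict(weights, vocab, ["el", "banco", "verde"]))    # ['the', 'bench', 'green']
```

In this toy run the context feature `next=central` ends up with a positive weight for "bank" and a negative weight for "bench", which is the role the abstract assigns to each weight: it measures how much a feature's presence favors a given target word.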

Bibliographic data

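  • Publication number: EP1939758A2

    Patent type:

  • Publication date: 2008-07-02

    Original format: PDF

  • Applicant/Patentee: AT&T CORP.

    Application number: EP20070122900

  • Filing date: 2007-12-11

  • Classification: G06F17/28

  • Country: EP

  • Database entry time: 2022-08-21 19:55:41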
