SYSTEM AND METHOD FOR PRUNING A SET OF SYMBOL-BASED SEQUENCES BY RELAXING AN INDEPENDENCE ASSUMPTION OF THE SEQUENCES

Abstract

A pruning method includes representing a set of sequences in a data structure. Each sequence s includes a first symbol w and a context c of at least one symbol. Some of the sequences are associated with a conditional probability p(w|c), based on observations of cw in training data. For the other sequences, p(w|c) is computed as a function of the probability p(w|ĉ) of the respective symbol w in a back-off context ĉ, where p(w|ĉ) is based on observations of the sequence ĉw in the training data. A scoring function value f(cw) is computed for each sequence in the set, based on p(w|c) for the sequence and on a probability distribution p(s) of each symbol in the sequence if the sequence is removed from the set of sequences. Iteratively, one of the represented sequences is selected for removal, based on the computed scoring function values, and the scoring function values of the remaining sequences are updated.
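
The iterative loop described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the scoring function claimed in the patent: the abstract does not give a formula for f(cw), so `score` below substitutes a simplified entropy-style loss (the joint probability of the sequence times the absolute log-ratio between its explicit and back-off probabilities). The names `cond_prob`, `joint_prob`, `score`, and `prune` are hypothetical. Back-off weights are omitted, and unigrams are assumed to be always retained.

```python
import math

def cond_prob(model, seq):
    """p(w|c) for seq = c + (w,). Uses the stored value if present; otherwise
    backs off by dropping the oldest context symbol (no back-off weights)."""
    while seq not in model and len(seq) > 1:
        seq = seq[1:]
    return model[seq]  # assumption: every unigram is stored

def joint_prob(model, seq):
    """Chain-rule probability of the whole sequence under the current model."""
    p = 1.0
    for i in range(len(seq)):
        p *= cond_prob(model, seq[: i + 1])
    return p

def score(model, seq):
    """Hypothetical f(cw): weighted log-probability change if seq is removed."""
    p_explicit = model[seq]
    p_backoff = cond_prob(model, seq[1:])  # what p(w|c) would become
    return joint_prob(model, seq) * abs(math.log(p_explicit / p_backoff))

def prune(model, target_size):
    """Remove the lowest-scoring sequence one at a time, rescoring after each
    removal because the remaining scores depend on what is still stored."""
    model = dict(model)
    removed = []
    while len(model) > target_size:
        candidates = [s for s in model if len(s) > 1]  # never drop unigrams
        if not candidates:
            break
        scores = {s: score(model, s) for s in candidates}
        victim = min(scores, key=scores.get)
        del model[victim]
        removed.append(victim)
    return model, removed

# Toy model: keys are symbol tuples (context..., w), values are p(w|c).
toy = {
    ("a",): 0.5, ("b",): 0.5,
    ("a", "b"): 0.9, ("a", "a"): 0.1,
    ("b", "a"): 0.5, ("b", "b"): 0.5,
}
pruned, removed = prune(toy, target_size=4)
```

In the toy model, the two bigrams with context `b` carry the same probability as their unigram back-off, so removing them changes nothing and they are pruned first. Recomputing every score on each pass is the simplest way to keep scores consistent after a removal; a practical implementation would update only the affected sequences.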

Bibliographic details

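  • Publication number: US2018075084A1

    Patent type: (not specified)

  • Publication date: 2018-03-15

    Original format: PDF

  • Applicant/assignee: CONDUENT BUSINESS SERVICES LLC

    Application number: US201615262383

  • Inventors: MATIAS HUNICKEN; MATTHIAS GELLÉ

    Filing date: 2016-09-12

  • Classification: G06F17/30; G06F17/18

  • Country: US

  • Record added: 2022-08-21 13:04:24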
