Hidden Markov Models (HMM) approximating finite transducers and their use for text tagging

Abstract

A sequential transducer is derived from a Hidden Markov Model (HMM) so that it closely approximates the behavior of the stochastic model. The invention provides (a) a method (called n-type approximation) of deriving, from HMM probability matrices, a simple finite-state transducer that is applicable in all cases; (b) a method (called s-type approximation) for building a precise HMM transducer for selected cases taken from a training corpus; (c) a method for completing the precise (s-type) transducer with sequences from the simple (n-type) transducer, which makes the precise transducer applicable in all cases; and (d) a method (called b-type approximation) for building an HMM transducer with variable precision that is applicable in all cases. This transformation is especially advantageous for part-of-speech tagging because the resulting transducer can be composed with other transducers that encode correction rules for the most frequent tagging errors. The speed of tagging is also improved. The described methods have been implemented and successfully tested on six languages.
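
The abstract does not spell out how a transducer is read off the HMM probability matrices, so the following is only a minimal sketch of what a first-order n-type approximation could look like. It assumes one transducer state per previous tag (plus a start state) and assumes each arc emits the tag that maximizes transition probability times class probability. That selection rule, the names `build_n1_transducer` and `tag_sequence`, and all toy probabilities are illustrative assumptions, not taken from the patent text.

```python
from typing import Dict, FrozenSet, List, Tuple

Tag = str
AmbClass = FrozenSet[str]  # ambiguity class: the set of tags a word may take
START = "START"            # hypothetical name for the initial state


def build_n1_transducer(
    initial: Dict[Tag, float],
    trans: Dict[Tag, Dict[Tag, float]],
    emit: Dict[Tag, Dict[AmbClass, float]],
    classes: List[AmbClass],
) -> Dict[Tuple[str, AmbClass], Tag]:
    """Map (state, input class) to an output tag; the output tag is also the next state."""
    arcs: Dict[Tuple[str, AmbClass], Tag] = {}
    for cls in classes:
        candidates = sorted(cls)  # sorted for a deterministic tie-break
        # Arc leaving the start state: initial probability times class probability.
        arcs[(START, cls)] = max(
            candidates, key=lambda t: initial[t] * emit[t].get(cls, 0.0)
        )
        # Arcs leaving the state of each previous tag: transition times class probability.
        for prev in initial:
            arcs[(prev, cls)] = max(
                candidates, key=lambda t: trans[prev][t] * emit[t].get(cls, 0.0)
            )
    return arcs


def tag_sequence(
    arcs: Dict[Tuple[str, AmbClass], Tag], class_seq: List[AmbClass]
) -> List[Tag]:
    """Run the deterministic transducer: one table lookup per input class."""
    state, output = START, []
    for cls in class_seq:
        state = arcs[(state, cls)]
        output.append(state)
    return output


# Toy model (invented numbers, for illustration only).
D = frozenset({"DET"})
N = frozenset({"NOUN"})
V = frozenset({"VERB"})
NV = frozenset({"NOUN", "VERB"})

initial = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}
trans = {
    "DET": {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.2, "VERB": 0.7},
    "VERB": {"DET": 0.5, "NOUN": 0.3, "VERB": 0.2},
}
emit = {
    "DET": {D: 1.0},
    "NOUN": {N: 0.2, NV: 0.8},
    "VERB": {V: 0.4, NV: 0.6},
}

arcs = build_n1_transducer(initial, trans, emit, [D, N, V, NV])
print(tag_sequence(arcs, [D, NV, NV]))  # ['DET', 'NOUN', 'VERB']
```

In this sketch, tagging is a single left-to-right pass of table lookups with no probability arithmetic at run time, which illustrates why a transducer of this kind can tag faster than the HMM it approximates.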

Bibliographic data

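  • Publication number: DE69802402T2

    Patent type

  • Publication date: 2002-06-06

    Original format: PDF

  • Applicant/assignee: XEROX CORP. ROCHESTER

    Application number: DE1998602402T

  • Inventor: KEMPE ANDRE

    Filing date: 1998-07-06

  • Classification: G06F17/27

  • Country: DE

  • Record added: 2022-08-22 00:24:52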
