Home > Foreign Journals > Latin American Applied Research

A new estimator based on maximum entropy


Abstract

In this paper, we propose a new formulation of the classical Good-Turing estimator for n-gram language models. The new approach is based on defining a dynamic model of language production: instead of assuming a fixed probability distribution for the occurrence of an n-gram over the whole text, we propose a maximum entropy approximation of a time-varying distribution. This approximation leads to a new distribution, which in turn is used to compute the expectations in the Good-Turing estimator. The result is a new estimator that we call the Maximum Entropy Good-Turing estimator. In contrast to the classical Good-Turing estimator, the new formulation requires neither approximations of expectations nor windowing or other smoothing techniques, and it contains the well-known discounting estimators as special cases. Performance is evaluated both in terms of perplexity and of word error rate in an N-best re-scoring task, and the estimator is compared with other classical estimators. In all cases our approach performs significantly better than the classical estimators.
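The abstract does not spell out the new maximum entropy formulation itself, but as background, here is a minimal sketch of the classical Good-Turing count re-estimation it builds on: an observed count r is replaced by r* = (r + 1) N_{r+1} / N_r, where N_r is the number of distinct n-grams seen exactly r times, and the mass N_1 / N is reserved for unseen events. Function names are illustrative, not from the paper.

```python
from collections import Counter

def good_turing_adjusted_counts(counts):
    """Classical Good-Turing re-estimation: replace an observed count r
    by r* = (r + 1) * N_{r+1} / N_r, where N_r is the number of distinct
    items observed exactly r times (the "count of counts")."""
    n_r = Counter(counts.values())  # N_r: frequency of frequencies
    adjusted = {}
    for item, r in counts.items():
        if n_r.get(r + 1, 0) > 0:
            adjusted[item] = (r + 1) * n_r[r + 1] / n_r[r]
        else:
            # N_{r+1} = 0: the raw formula breaks down; in practice one
            # smooths N_r (the point where techniques like windowing enter).
            # Here we simply fall back to the raw count.
            adjusted[item] = float(r)
    return adjusted

def unseen_mass(counts):
    """Probability mass Good-Turing reserves for unseen items: N_1 / N."""
    n_r = Counter(counts.values())
    total = sum(counts.values())
    return n_r.get(1, 0) / total
```

The need to smooth the counts-of-counts N_r when N_{r+1} is zero or noisy is precisely what the paper claims its maximum entropy formulation avoids.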
