Estimation of stochastic context-free grammars and their use as language models

Abstract

This paper is devoted to the estimation of stochastic context-free grammars (SCFGs) and their use as language models. Classical estimation algorithms, together with new ones that consider a certain subset of derivations in the estimation process, are presented in a unified framework. This set of derivations is chosen according to both structural and statistical criteria. The estimated SCFGs have been used in a new hybrid language model to combine both a word-based n-gram, which is used to capture the local relations between words, and a category-based SCFG together with a word distribution into categories, which is defined to represent the long-term relations between these categories. We describe methods for learning these stochastic models for complex tasks, and we present an algorithm for computing the word transition probability using this hybrid language model. Finally, experiments on the UPenn Treebank corpus show significant improvements in the test set perplexity with regard to the classical word trigram models.

Bibliographic record

  • Source
    Computer Speech and Language | 2005, issue 3 | pp. 249-274 | 26 pages
  • Authors

    J. M. Benedi; J. A. Sanchez

  • Author affiliations

    Departamento de Sistemas Informáticos y Computación, Universidad Politécnica de Valencia, Camino de Vera s, 46022 Valencia, Spain (both authors)

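  • Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
  • Original format: PDF
  • Language: English
  • Chinese Library Classification: computing technology, computer technology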