Source: Computer Speech and Language

Language modeling with probabilistic left corner parsing


Abstract

We present a novel language model, suitable for large-vocabulary continuous speech recognition, based on parsing with a probabilistic left corner grammar (PLCG). The PLCG probabilities are conditioned on local and non-local features of the partial parse tree, and some of these features are lexical. They are not derived from another stochastic grammar, but directly induced from a treebank, a corpus of text sentences, annotated with parse trees. A context-enriched constituent represents all partial parse trees that are equivalent with respect to the probability of the next parse move. For computational efficiency the parsing problem is represented as a traversal through a compact stochastic network of constituents connected by PLCG moves. The efficiency of the algorithm is due to the fact that the network consists of recursively nested, shared subnetworks. The PLCG-based language model results from accumulating the probabilities of all (partial) paths through this network. Next word probabilities can be computed synchronously with the probabilistic left corner parsing algorithm in one single pass from left to right. They are guaranteed to be normalized, even when pruning less likely paths. Finally, it is shown experimentally that the PLCG-based language model is a competitive alternative to other syntax-based language models, both in efficiency and accuracy.
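The abstract does not say how next-word probabilities stay normalized under pruning. As a hedged illustration only, and not the paper's algorithm, the toy sketch below shows one way such a guarantee can arise: if each surviving partial parse carries its own normalized next-word distribution, and the path weights are renormalized over the survivors, the mixture still sums to one however many paths are dropped. The names `Path`, `prune`, and `next_word_probs`, and all the numbers, are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical toy representation (an assumption, not the paper's data
# structure): each partial parse "path" is a pair of
#   (accumulated path probability, normalized next-word distribution).
Path = Tuple[float, Dict[str, float]]


def prune(paths: List[Path], beam: int) -> List[Path]:
    """Keep only the `beam` most probable paths."""
    return sorted(paths, key=lambda p: p[0], reverse=True)[:beam]


def next_word_probs(paths: List[Path]) -> Dict[str, float]:
    """Mix per-path next-word distributions, weighting each path by its
    share of the total probability mass of the surviving paths."""
    total = sum(prob for prob, _ in paths)
    mixed: Dict[str, float] = {}
    for prob, dist in paths:
        weight = prob / total  # renormalized over surviving paths
        for word, p in dist.items():
            mixed[word] = mixed.get(word, 0.0) + weight * p
    return mixed


# Made-up numbers, for illustration only.
paths: List[Path] = [
    (0.020, {"the": 0.6, "a": 0.3, "dog": 0.1}),
    (0.010, {"the": 0.2, "a": 0.2, "dog": 0.6}),
    (0.001, {"the": 0.1, "a": 0.1, "dog": 0.8}),
]

full = next_word_probs(paths)
pruned = next_word_probs(prune(paths, beam=2))

print(sum(full.values()), sum(pruned.values()))
```

Both sums come out to one up to floating-point rounding. Pruning changes the individual word probabilities but not their total, because the weights are always divided by the mass of the paths that remain.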
