
A Parallel Training Algorithm for Hierarchical Pitman-Yor Process Language Models


Abstract

The Hierarchical Pitman-Yor Process Language Model (HPYLM) is a Bayesian language model based on a non-parametric prior, the Pitman-Yor process. It has been demonstrated, both theoretically and empirically, that the HPYLM provides better smoothing for language modeling than state-of-the-art approaches such as interpolated and modified Kneser-Ney smoothing. However, estimating Bayesian language models is expensive in both computation time and memory: the inference is approximate and requires a number of iterations to converge. In this paper, we present a parallel training algorithm for the HPYLM which enables the approach to be applied to automatic speech recognition, using large training corpora with large vocabularies. We demonstrate the effectiveness of the proposed algorithm by estimating language models from meeting-transcription corpora containing over 200 million words, and observe significant reductions in perplexity and word error rate.
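For reference, the predictive probability under the HPYLM (following Teh, 2006) makes the connection to Kneser-Ney smoothing explicit. With c_{uw} the number of tokens of word w observed after context u, t_{uw} the number of tables serving w in the Chinese-restaurant representation of that context, c_{u\cdot} and t_{u\cdot} the corresponding totals, d_{|u|} and \theta_{|u|} the discount and strength parameters shared by contexts of length |u|, and \pi(u) the context u with its earliest word dropped:

P(w \mid u) = \frac{c_{uw} - d_{|u|}\, t_{uw}}{\theta_{|u|} + c_{u\cdot}} + \frac{\theta_{|u|} + d_{|u|}\, t_{u\cdot}}{\theta_{|u|} + c_{u\cdot}}\, P(w \mid \pi(u))

Fixing t_{uw} = \min(1, c_{uw}) recovers interpolated Kneser-Ney smoothing, which is why the HPYLM can be read as a Bayesian generalization of Kneser-Ney.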
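The approximate, iterative inference the abstract refers to is Gibbs sampling over Chinese-restaurant seating arrangements. The minimal Python sketch below illustrates that core step for a single restaurant (one context) with a fixed uniform base distribution; it follows the standard sampler of Teh (2006) rather than the parallel algorithm this paper contributes, and the class and parameter names (PYRestaurant, discount, strength) are our own illustrative choices.

import random
from collections import defaultdict

class PYRestaurant:
    """A single Pitman-Yor Chinese restaurant (one n-gram context).

    Tables are grouped by the word they serve; each entry in
    tables[word] is the number of customers (tokens) at one table.
    """

    def __init__(self, discount=0.75, strength=1.0):
        self.d = discount          # discount parameter
        self.theta = strength      # strength parameter
        self.tables = defaultdict(list)
        self.num_customers = 0
        self.num_tables = 0

    def add_customer(self, word, p_base):
        # Join an existing table serving `word` with probability
        # proportional to (count - d); open a new table with probability
        # proportional to (theta + d * total_tables) * p_base.
        counts = self.tables[word]
        weights = [c - self.d for c in counts]
        weights.append((self.theta + self.d * self.num_tables) * p_base)
        k = random.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)       # a new table is opened
            self.num_tables += 1
        else:
            counts[k] += 1
        self.num_customers += 1

    def remove_customer(self, word):
        # Remove one token from a table chosen proportionally to its size.
        counts = self.tables[word]
        k = random.choices(range(len(counts)), weights=counts)[0]
        counts[k] -= 1
        if counts[k] == 0:
            del counts[k]          # an emptied table is closed
            self.num_tables -= 1
        self.num_customers -= 1

    def prob(self, word, p_base):
        # Predictive probability: discounted counts interpolated with
        # p_base, the same form as interpolated Kneser-Ney smoothing.
        c_w, t_w = sum(self.tables[word]), len(self.tables[word])
        denom = self.theta + self.num_customers
        return (c_w - self.d * t_w) / denom + \
               (self.theta + self.d * self.num_tables) * p_base / denom

random.seed(0)
vocab = ["a", "b", "c"]
p0 = 1.0 / len(vocab)              # fixed uniform base distribution
data = ["a", "a", "b", "a", "c", "b"]
rest = PYRestaurant()
for w in data:
    rest.add_customer(w, p0)
for _ in range(10):                # Gibbs sweeps: re-seat every token
    for w in data:
        rest.remove_customer(w)
        rest.add_customer(w, p0)
print({w: round(rest.prob(w, p0), 3) for w in vocab})

In the full hierarchical model, p_base is not a constant but the parent restaurant's predictive probability for the shortened context, and opening or closing a table sends or removes a customer there; these dependencies couple every context in the tree during sampling, which is what makes parallelizing the training non-trivial.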

Bibliographic information

  • Authors

    Huang Songfang; Renals Steve;

  • Affiliation University of Edinburgh
  • Year 2009
  • Pages
  • Format PDF
  • Language English
  • CLC classification
