Similar N-gram Language Model



Abstract

This paper describes an extension of the n-gram language model: the similar n-gram language model. The estimation of the probability P(s) of a string s by the classical model of order n is computed using statistics of occurrences of the last n words of the string in the corpus, whereas the proposed model further uses all the strings s' for which the Levenshtein distance to s is smaller than a given threshold. The similarity between s and each string s' is estimated using co-occurrence statistics. The new P(s) is approximated by smoothing all the similar n-gram probabilities with a regression technique. A slight but statistically significant decrease in the word error rate is obtained on a state-of-the-art automatic speech recognition system when the similar n-gram language model is interpolated linearly with the n-gram model.
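The neighbourhood idea in the abstract can be illustrated with a minimal sketch. It assumes a word-level Levenshtein distance and a strict "smaller than threshold" test, as the abstract states. The co-occurrence-based similarity and the regression smoothing are not specified in the abstract and are not reproduced here. All names (`levenshtein`, `similar_ngrams`, `interpolate`, `lam`) and the toy corpus are hypothetical, chosen only for illustration.

```python
from collections import Counter


def levenshtein(a, b):
    """Word-level edit distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def similar_ngrams(target, counts, threshold):
    """Observed n-grams whose distance to `target` is below `threshold`."""
    return {g: c for g, c in counts.items()
            if levenshtein(target, g) < threshold}


def interpolate(p_ngram, p_similar, lam):
    """Linear interpolation of two probability estimates (weight `lam` is assumed)."""
    return lam * p_similar + (1.0 - lam) * p_ngram


# Toy corpus, purely illustrative.
corpus = "the cat sat on the mat the cat lay on the mat".split()
counts = Counter(ngrams(corpus, 3))
neighbours = similar_ngrams(("the", "cat", "sat"), counts, threshold=2)
```

With this toy corpus, `neighbours` holds the target trigram itself and `("the", "cat", "lay")`, which differs by one substitution. How the paper weights such neighbours and combines their probabilities is left to the full text.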


Bibliographic details

Source
Annual Conference of the International Speech Communication Association; INTERSPEECH 2010 | 2011 | pp. 1824-1827 | 4 pages
Conference location
Authors
Christian Gillot; Christophe Cerisara; David Langlois; Jean-Paul Haton
Author affiliations

Conference organizer
Original format: PDF
Language of text
Chinese Library Classification: Communications
Keywords
language modeling; ngram; similarity; string edit; Levenshtein distance


Similar literature


1. An empirical study of statistical language models: n-gram language models vs. neural network language models [J]. Freha Mezzoudj, Abdelkader Benyettou. International Journal of Innovative Computing and Applications. 2018, No. 4
2. Converting Continuous-Space Language Models into N-gram Language Models with Efficient Bilingual Pruning for Statistical Machine Translation [J]. Rui Wang, Masao Utiyama, Isao Goto. ACM Transactions on Asian Language Information Processing. 2016, No. 3
3. NgramSPD: Exploring optimal n-gram model for sentiment polarity detection in different languages [J]. Graovac Jelena, Mladenovic Miljana, Tanasijevic Ivana. Intelligent Data Analysis. 2019, No. 2
4. Show Some Love to Your n-grams: A Bit of Progress and Stronger n-gram Language Modeling Baselines [C]. Ehsan Shareghi, Daniela Gerz, Ivan Vulic. Conference on the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2019
5. Language-independent text learning with statistical n-gram language models [D]. Peng, Fuchun. 2003
6. Modeling Actions of PubMed Users with N-Gram Language Models [O]. Jimmy Lin, W. John Wilbur
7. Ways to Improve N-gram Language Models for OCR and Speech Recognition of Slavic Languages [O]. V. Taranukha. 2015
8. Investigation of Back-off Based Interpolation Between Recurrent Neural Network and N-gram Language Models (Author's Manuscript) [R]. Chen, X., Liu, X., Gales, M. J. F. 2016
