
Improved Smoothing for N-gram Language Models Based on Ordinary Counts



Abstract

Kneser-Ney (1995) smoothing and its variants are generally recognized as having the best perplexity of any known method for estimating N-gram language models. Kneser-Ney smoothing, however, requires nonstandard N-gram counts for the lower-order models used to smooth the highest-order model. For some applications, this makes Kneser-Ney smoothing inappropriate or inconvenient. In this paper, we introduce a new smoothing method based on ordinary counts that outperforms all of the previous ordinary-count methods we have tested, with the new method eliminating most of the gap between Kneser-Ney and those methods.
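The "nonstandard counts" mentioned above are Kneser-Ney's continuation counts: the lower-order model counts how many distinct contexts a word follows rather than how often it occurs. A minimal sketch of that distinction (toy bigrams, illustrative only, not data from the paper):

```python
from collections import Counter

# Toy bigram corpus (illustrative only).
bigrams = [("san", "francisco"), ("san", "francisco"), ("san", "francisco"),
           ("the", "cat"), ("a", "cat"), ("big", "cat")]

# Ordinary unigram counts: total occurrences of each word.
ordinary = Counter(w for _, w in bigrams)

# Kneser-Ney continuation counts: number of distinct left contexts a
# word appears after -- the "nonstandard" lower-order counts the
# abstract refers to. Counting over the set of distinct bigrams gives
# one count per (context, word) type.
continuation = Counter(w for _, w in set(bigrams))

print(ordinary["francisco"], continuation["francisco"])
print(ordinary["cat"], continuation["cat"])
```

Here "francisco" is as frequent as "cat" (ordinary count 3 for both), but it follows only one context, so its continuation count is 1 while "cat" keeps 3. A smoothing method based on ordinary counts, like the one this paper proposes, must close the perplexity gap without access to this context-diversity signal.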

