Improving cross-domain n-gram language modelling with skipgrams

Abstract

In this paper we improve over the hierarchical Pitman-Yor processes language model in a cross-domain setting by adding skipgrams as features. We find that adding skipgram features reduces the perplexity. This reduction is substantial when models are trained on a generic corpus and tested on domain-specific corpora. We also find that within-domain testing and cross-domain testing require different backoff strategies. We observe a 30-40% reduction in perplexity in a cross-domain language modelling task, and up to 6% reduction in a within-domain experiment, for both English and Flemish-Dutch.
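
To make the notion of a skipgram feature concrete, the following is a minimal sketch of how skipgram patterns could be derived from a token sequence. It assumes that a skipgram is an n-gram in which one or more interior tokens are replaced by a wildcard; the abstract does not spell out the paper's exact definition, so the function name `skipgrams` and the wildcard marker `{*}` are hypothetical and used only for illustration.

```python
from itertools import combinations

SKIP = "{*}"  # hypothetical wildcard marker, not taken from the paper


def skipgrams(tokens, n):
    """Return all patterns obtained from each n-gram in `tokens` by
    replacing one or more interior tokens with SKIP.

    Assumption: the first and last token of an n-gram are never skipped,
    so n-grams with n < 3 produce no skipgrams.
    """
    patterns = []
    for start in range(len(tokens) - n + 1):
        ngram = tokens[start:start + n]
        interior = range(1, n - 1)          # positions that may be skipped
        for k in range(1, n - 1):           # how many positions to skip
            for skipped in combinations(interior, k):
                pattern = tuple(
                    SKIP if j in skipped else tok
                    for j, tok in enumerate(ngram)
                )
                patterns.append(pattern)
    return patterns


if __name__ == "__main__":
    print(skipgrams("the cat sat on".split(), 3))
    # [('the', '{*}', 'sat'), ('cat', '{*}', 'on')]
```

Patterns of this kind could be counted alongside ordinary n-grams and used as additional context features; how the paper integrates them into the hierarchical Pitman-Yor model and its backoff strategies is not described in this record.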


Bibliographic record

Source
Annual meeting of the Association for Computational Linguistics | 2016 | pp. 137-142 | 6 pages
Authors
Louis Onrust; Antal van den Bosch; Hugo Van hamme
Original format: PDF

Similar literature

1. Efficient n-gram, Skipgram and Flexgram Modelling with Colibri Core [J]. Maarten van Gompel, Antal van den Bosch. Journal of Open Research Software. 2016, No. 1
2. An empirical study of statistical language models: n-gram language models vs. neural network language models [J]. Freha Mezzoudj, Abdelkader Benyettou. International Journal of Innovative Computing and Applications. 2018, No. 4
3. Converting Continuous-Space Language Models into N-gram Language Models with Efficient Bilingual Pruning for Statistical Machine Translation [J]. Rui Wang, Masao Utiyama, Isao Goto. ACM Transactions on Asian Language Information Processing. 2016, No. 3
4. Improving cross-domain n-gram language modelling with skipgrams [C]. Louis Onrust, Antal van den Bosch, Hugo Van hamme. Annual meeting of the Association for Computational Linguistics. 2016
5. Language-independent text learning with statistical n-gram language models [D]. Peng, Fuchun. 2003
6. Modeling Actions of PubMed Users with N-Gram Language Models [O]. Jimmy Lin, W. John Wilbur
7. Improving cross-domain n-gram language modelling with skipgrams [O]. Onrust L., Bosch A.P.J. van den, Hamme H. Van. 2016
