Home > Foreign Conference Papers > Modeling and Using Context > A New Method Based on Context for Combining Statistical Language Models

A New Method Based on Context for Combining Statistical Language Models



Abstract

In this paper we propose a new method to extract from a corpus the histories for which a given language model is better than another. The decision is based on a measure derived from perplexity, which, for a given history, compares two language models and selects the better one for that history. Using this principle, with a 20K-word vocabulary, we combined two language models: a bigram and a distant bigram. The contribution of the distant bigram is significant, outperforming the bigram model by 7.5%. Moreover, performance in the Shannon game is improved. We show in this article that our framework for combining language models is cheaper than one based on the maximum entropy principle. In addition, the selected histories for which one model is better than another have been collected and studied; almost all of them are beginnings of very frequently used French phrases. Finally, by using this principle, we achieve a better trigram model in terms of parameters and perplexity. This model is a combination of a bigram and a trigram based on a selected history.
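The per-history selection principle described in the abstract can be sketched as follows. This is a minimal illustration only, not the paper's implementation: the scoring function here is plain per-history perplexity, and the toy models and the `pick_model` helper are invented for demonstration.

```python
import math

def history_perplexity(prob, words):
    """Perplexity of a conditional model on the words observed after one history.

    prob: callable w -> P(w | h); words: the words that actually follow h
    in the corpus. Lower perplexity means the model predicts them better.
    """
    log_sum = sum(math.log2(prob(w)) for w in words)
    return 2.0 ** (-log_sum / len(words))

def pick_model(history, following_words, models):
    """Score each candidate model on this history and return the best one.

    models: dict of name -> callable (history, word) -> probability.
    Returns (name_of_best_model, all_scores).
    """
    scores = {name: history_perplexity(lambda w: m(history, w), following_words)
              for name, m in models.items()}
    return min(scores, key=scores.get), scores

# Toy models (hypothetical probabilities, for illustration only):
# a "bigram" that is confident about "chat" after "le", and a weaker
# "distant bigram" stand-in.
bigram = lambda h, w: 0.8 if (h, w) == ("le", "chat") else 0.05
distant_bigram = lambda h, w: 0.3 if (h, w) == ("le", "chat") else 0.1

best, scores = pick_model("le", ["chat", "chat", "chat"],
                          {"bigram": bigram, "distant_bigram": distant_bigram})
# The bigram's per-history perplexity is 1/0.8 = 1.25, the distant
# bigram's is 1/0.3 ≈ 3.33, so the bigram is selected for history "le".
```

In a full combined model, this comparison would be run once per selected history over the training corpus, and decoding would then dispatch each history to whichever model won.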
