Maximum entropy language modeling with non-local dependencies.

Abstract

Stochastic language models are an important component of many natural language processing applications, such as automatic speech recognition and machine translation. A language model is a probability measure on word sequences in a language. The most widely used models are N-gram models, which treat a word sequence as a Markov process and predict the next word from the preceding N-1 words. For reasons of data sparseness, N is typically 2-4. N-gram models successfully "learn" local lexical dependencies, but they fail to capture syntactic well-formedness in sentences and semantic coherence within and across sentences.

To improve the performance of language models, two critical problems must be solved: first, deciding what kinds of long-range dependence should be used in language models, and second, determining how dependencies from different sources can be incorporated into a sound model. This dissertation presents a new language model that overcomes some of the shortcomings of N-gram models by combining collocational dependencies with two important sources of long-range dependence: the syntactic structure of a sentence and the topic of a discourse. Maximum entropy techniques, which are particularly well suited to modeling diverse sources of statistical dependence, are used.

Previously known parameter estimation procedures for maximum entropy models have a computational cost that makes them impractical for large-scale applications, including the two language modeling tasks examined in this dissertation. Some fundamental algorithmic improvements to the parameter estimation procedure for maximum entropy models are presented, reducing the computational complexity of parameter estimation by 2-3 orders of magnitude.

Significant improvements due to the new language model over a trigram model are demonstrated in perplexity and in word error rate on the Switchboard and Broadcast News tasks. Experimental results show that topic information is most helpful for content-bearing words, and that syntactic structure is most useful when the words that matter for prediction lie beyond the reach of an N-gram window. The results also show that the topic dependence and the syntactic dependence are complementary, and that the gains from modeling them are nearly additive. A comparison of maximum entropy models with other models proposed in the literature is provided throughout the dissertation.
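For reference, a conditional maximum entropy model of this kind takes the standard exponential form below; the notation is the textbook formulation, not copied from the dissertation, with the features f_i understood as indicators drawn from the N-gram, syntactic, and topic sources described above:

    P(w \mid h) = \frac{1}{Z(h)} \exp\!\Big(\sum_i \lambda_i f_i(w, h)\Big),
    \qquad
    Z(h) = \sum_{w' \in V} \exp\!\Big(\sum_i \lambda_i f_i(w', h)\Big),

where h is the conditioning history (preceding words, syntactic heads, discourse topic) and the weights \lambda_i are chosen so that each feature's expectation under the model matches its empirical expectation, E_P[f_i] = E_{\tilde{P}}[f_i]. The per-history normalizer Z(h), a sum over the whole vocabulary V, is the dominant cost in training such models, which is why faster parameter estimation matters for large-scale tasks.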
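To make the framework concrete, here is a minimal, self-contained Python sketch of a conditional maximum entropy next-word model combining a local bigram feature with topic and syntactic-head features. The toy vocabulary, feature templates, and plain gradient-ascent trainer are illustrative assumptions only; they are not the dissertation's actual feature sets or its accelerated estimation procedure (classical training would use GIS/IIS).

    import math
    from collections import defaultdict

    VOCAB = ["the", "fed", "raised", "rates", "growth", "."]

    def features(history, topic, head, word):
        """Indicator features: one local (bigram) and two non-local
        (topic-word, syntactic head-word) dependencies."""
        return [
            ("bigram", history[-1], word),  # local N-gram dependency
            ("topic", topic, word),         # non-local topic dependency
            ("head", head, word),           # non-local syntactic dependency
        ]

    class MaxEntLM:
        def __init__(self):
            self.weights = defaultdict(float)  # one lambda_i per feature

        def logscore(self, ctx, word):
            return sum(self.weights[f] for f in features(*ctx, word))

        def prob(self, ctx, word):
            # P(w | h) = exp(sum_i lambda_i f_i(h, w)) / Z(h);
            # Z(h) sums over the whole vocabulary.
            z = sum(math.exp(self.logscore(ctx, w)) for w in VOCAB)
            return math.exp(self.logscore(ctx, word)) / z

        def train(self, data, iters=200, lr=0.5):
            # Gradient ascent on conditional log-likelihood:
            # d/d(lambda_i) = empirical count - model expectation.
            for _ in range(iters):
                grad = defaultdict(float)
                for ctx, word in data:
                    for f in features(*ctx, word):
                        grad[f] += 1.0              # empirical count
                    for w in VOCAB:
                        p = self.prob(ctx, w)
                        for f in features(*ctx, w):
                            grad[f] -= p            # model expectation
                for f, g in grad.items():
                    self.weights[f] += lr * g / len(data)

    # Toy usage: each context is (preceding words, topic label, syntactic head).
    data = [
        ((["the"], "ECONOMY", "raised"), "fed"),
        ((["fed"], "ECONOMY", "fed"), "raised"),
        ((["raised"], "ECONOMY", "raised"), "rates"),
    ]
    lm = MaxEntLM()
    lm.train(data)
    print(round(lm.prob((["raised"], "ECONOMY", "raised"), "rates"), 3))

The sketch makes the scaling problem visible: every gradient step recomputes Z(h) over the vocabulary for every training context, which is exactly the cost that the algorithmic improvements summarized in the abstract reduce by orders of magnitude.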
