
Interpolated Spectral NGram Language Models


Abstract

Spectral models for learning weighted non-deterministic automata have attractive theoretical and algorithmic properties. Despite this, it has been challenging to obtain competitive results in language modeling tasks, for two main reasons. First, to capture long-range dependencies in the data, the method must use statistics from long substrings, which results in very large matrices that are difficult to decompose. Second, the loss function behind spectral learning, based on moment matching, differs from the probabilistic metrics used to evaluate language models. In this work we employ a technique for scaling up spectral learning, and use interpolated predictions that are optimized to minimize perplexity. Our experiments in character-based language modeling show that our method matches the performance of state-of-the-art n-gram models, while being very fast to train.
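The first difficulty the abstract mentions concerns the matrices that spectral learning must decompose. A common formulation (not necessarily the exact one used in this paper) collects substring statistics into a Hankel matrix, truncates its SVD, and reads off the operators of a weighted automaton. The sketch below illustrates that generic recipe on toy data; the corpus, the prefix/suffix basis, and the choice of statistic are all invented for illustration:

```python
import numpy as np
from collections import Counter

# Toy corpus; in the paper's character-based setting these would be
# much longer sequences, and the Hankel matrix correspondingly larger.
corpus = ["abab", "abba", "aab", "bab", "ab"]
alphabet = ["a", "b"]

def substring_stats(corpus):
    """Expected substring counts per string, one standard target statistic."""
    c = Counter()
    for s in corpus:
        for i in range(len(s) + 1):
            for j in range(i, len(s) + 1):
                c[s[i:j]] += 1
    n = len(corpus)
    return {x: v / n for x, v in c.items()}

f = substring_stats(corpus)

# A small, hand-picked prefix/suffix basis (a design choice in practice).
prefixes = ["", "a", "b", "ab"]
suffixes = ["", "a", "b", "ba"]

def hankel(mid=""):
    """Hankel block H_mid[u, v] = f(u . mid . v)."""
    return np.array([[f.get(u + mid + v, 0.0) for v in suffixes]
                     for u in prefixes])

H = hankel()
H_sigma = {c: hankel(c) for c in alphabet}

# Rank-d truncated SVD H ~= F @ B, then the standard spectral formulas
# recover the automaton's transition operators and boundary vectors.
d = 2
U, S, Vt = np.linalg.svd(H)
F = U[:, :d] * S[:d]            # forward factor, shape (|P|, d)
B = Vt[:d, :]                   # backward factor, shape (d, |S|)
Fp, Bp = np.linalg.pinv(F), np.linalg.pinv(B)

A = {c: Fp @ H_sigma[c] @ Bp for c in alphabet}  # one operator per symbol
alpha = H[0, :] @ Bp            # row of H for the empty prefix
omega = Fp @ H[:, 0]            # column of H for the empty suffix

def f_hat(x):
    """Automaton's estimate of the substring statistic f(x)."""
    v = alpha.copy()
    for c in x:
        v = v @ A[c]
    return float(v @ omega)
```

The scaling problem the abstract refers to is visible here: to model long-range dependencies the prefix and suffix sets must contain long strings, so `H` grows rapidly and its decomposition becomes the bottleneck.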
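The second ingredient, interpolated predictions tuned for perplexity, can be illustrated with a deliberately simple stand-in: fixed character n-gram component models whose mixture weights are fit by EM to minimize held-out perplexity. This is a toy version of the interpolation idea, not the paper's method; the corpus, add-one smoothing, and function names are all assumptions of the sketch:

```python
import numpy as np
from collections import defaultdict

def train_ngram_counts(text, K):
    """Character n-gram counts for context lengths 0..K-1 ('^' pads the start)."""
    counts = [defaultdict(lambda: defaultdict(float)) for _ in range(K)]
    s = "^" * (K - 1) + text
    for i in range(K - 1, len(s)):
        for k in range(K):
            counts[k][s[i - k:i]][s[i]] += 1.0
    return counts

def cond_prob(counts, k, ctx, c, vocab):
    """Add-one smoothed P_k(c | last k characters); smoothing is a sketch choice."""
    table = counts[k].get(ctx, {})
    total = sum(table.values())
    return (table.get(c, 0.0) + 1.0) / (total + len(vocab))

def component_probs(counts, text, K, vocab):
    """M[t, k] = P_k(c_t | context) for every held-out position t and order k."""
    s = "^" * (K - 1) + text
    M = np.empty((len(text), K))
    for t, i in enumerate(range(K - 1, len(s))):
        for k in range(K):
            M[t, k] = cond_prob(counts, k, s[i - k:i], s[i], vocab)
    return M

def em_weights(M, iters=50):
    """EM for simplex mixture weights; each step cannot increase held-out NLL."""
    K = M.shape[1]
    lam = np.full(K, 1.0 / K)
    for _ in range(iters):
        R = M * lam                         # per-position responsibilities
        R /= R.sum(axis=1, keepdims=True)
        lam = R.mean(axis=0)                # M-step: average responsibility
    return lam

def perplexity(M, lam):
    return float(np.exp(-np.log(M @ lam).mean()))

# Example: fit weights on held-out text (toy data, for illustration only).
train = "abracadabra abracadabra abracadabra"
heldout = "abracadabra"
K = 3
vocab = sorted(set(train))
counts = train_ngram_counts(train, K)
M = component_probs(counts, heldout, K, vocab)
lam = em_weights(M)
```

Because the component models are fixed and only the weights move, EM monotonically improves the held-out likelihood, so the tuned weights never do worse than the uniform mixture. The paper's contribution is to do this kind of perplexity-driven interpolation with spectral models as the components.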

