IEEE International Conference on Acoustics, Speech and Signal Processing

COMPARING RNNS AND LOG-LINEAR INTERPOLATION OF IMPROVED SKIP-MODEL ON FOUR BABEL LANGUAGES: CANTONESE, PASHTO, TAGALOG, TURKISH


Abstract

Recurrent neural networks (RNNs) are a very recent technique for modeling long-range dependencies in natural languages. They have clearly outperformed trigrams and other more advanced language modeling techniques by modeling long-range dependencies non-linearly. An alternative is log-linear interpolation of skip models (i.e. skip bigrams and skip trigrams). This method as such has been published earlier. In this paper we investigate the impact of different smoothing techniques on the overall performance of the skip models. One option is to use automatically trained distance clusters (both hard and soft) to increase robustness and to combat sparseness in the skip model. We also investigate alternative smoothing techniques at the word level. For skip bigrams that skip only a small number of words, Kneser-Ney (KN) smoothing is advantageous. When a larger number of words is skipped, Dirichlet smoothing performs better. In order to exploit the advantages of both KN and Dirichlet smoothing, we propose a new unified smoothing technique. Experiments are performed on four Babel languages: Cantonese, Pashto, Tagalog and Turkish. RNNs and log-linearly interpolated skip models are on par if the skip models are trained with standard smoothing techniques. Using the improved smoothing of the skip models together with distance clusters, we clearly outperform RNNs by about 8-11% in perplexity across all four languages.
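For orientation, the following is a minimal sketch of a log-linear interpolation of a trigram model with skip-bigram components, together with the standard Dirichlet and Kneser-Ney smoothing forms that the abstract contrasts. The notation (component set, skip distances d, weights lambda, counts c) is generic and not taken from the paper, and the unified smoothing technique proposed there is not reproduced.

% Log-linear combination of a trigram model with skip-bigram components p_d,
% each conditioning on the word d positions back; Z_lambda(h) normalizes over the vocabulary V.
\[
  p(w_i \mid h) = \frac{1}{Z_\lambda(h)}\,
      p_{\mathrm{tri}}(w_i \mid w_{i-2}, w_{i-1})^{\lambda_0}
      \prod_{d \ge 2} p_d(w_i \mid w_{i-d})^{\lambda_d},
  \qquad
  Z_\lambda(h) = \sum_{v \in V}
      p_{\mathrm{tri}}(v \mid w_{i-2}, w_{i-1})^{\lambda_0}
      \prod_{d \ge 2} p_d(v \mid w_{i-d})^{\lambda_d}.
\]

% Dirichlet smoothing of a skip-bigram component: pseudo-count mu, unigram prior p(w).
\[
  p_d^{\mathrm{Dir}}(w \mid u) = \frac{c_d(u, w) + \mu\, p(w)}{c_d(u) + \mu}.
\]

% Interpolated Kneser-Ney form of the same component: discount D,
% N_{1+}(u,\cdot) = number of distinct words observed after u at distance d,
% p_cont(w) = continuation probability proportional to the number of distinct
% predecessors of w.
\[
  p_d^{\mathrm{KN}}(w \mid u) = \frac{\max\{c_d(u, w) - D,\, 0\}}{c_d(u)}
      + \frac{D\, N_{1+}(u, \cdot)}{c_d(u)}\, p_{\mathrm{cont}}(w).
\]

The interpolation weights lambda would be tuned on held-out data; the abstract's finding is that KN-style discounting suits small skip distances while the Dirichlet form suits large ones.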
