Within and Across Sentence Boundary Language Model

Abstract

In this paper, we propose two language modeling approaches, a skip trigram model and an across-sentence-boundary model, to capture long-range dependencies. The skip trigram model covers more distant predecessor words of the current word than a normal trigram while requiring the same memory space. The across-sentence-boundary model uses the word distribution of the previous sentences to compute a unigram probability, which is applied as the emission probability in the word and class model frameworks. Our experiments on the Penn Treebank [1] show that each of the proposed models, as well as their combination, significantly outperforms the baseline for both the word and class models and their linear interpolation. The linear interpolation of the word and class models with the proposed skip trigram and across-sentence-boundary models achieves a perplexity of 118.4, while the best state-of-the-art language model has a perplexity of 137.2 on the same dataset.
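The two components named in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the unsmoothed maximum-likelihood estimates, and the choice of which context position is skipped are all assumptions made for the example.

```python
from collections import Counter

def skip_trigram_counts(sentences, skip=1):
    # Count (w[i-2-skip], w[i-1], w[i]) triples: the earlier context word
    # is taken `skip` positions further back than in a normal trigram,
    # so more distant predecessors are covered while the number of
    # stored triples (and hence the memory footprint) stays the same.
    tri, ctx_counts = Counter(), Counter()
    for sent in sentences:
        toks = ["<s>"] * (2 + skip) + list(sent) + ["</s>"]
        for i in range(2 + skip, len(toks)):
            ctx = (toks[i - 2 - skip], toks[i - 1])
            tri[ctx + (toks[i],)] += 1
            ctx_counts[ctx] += 1
    return tri, ctx_counts

def skip_trigram_prob(w, ctx, tri, ctx_counts):
    # Maximum-likelihood estimate; a real model would add smoothing.
    return tri[ctx + (w,)] / ctx_counts[ctx] if ctx_counts[ctx] else 0.0

def across_sentence_unigram(prev_sentences):
    # Unigram distribution over the words of the preceding sentences,
    # i.e. the kind of quantity the across-sentence-boundary model uses
    # as an emission probability; here just raw relative frequencies.
    counts = Counter(w for s in prev_sentences for w in s)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}
```

In practice these estimates would be smoothed and linearly interpolated with the word and class models, as the abstract describes; the sketch only shows where the extra context enters each probability.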