
Learning to Create and Reuse Words in Open-Vocabulary Neural Language Modeling


Abstract

Fixed-vocabulary language models fail to account for one of the most characteristic statistical facts of natural language: the frequent creation and reuse of new word types. Although character-level language models offer a partial solution in that they can create word types not attested in the training corpus, they do not capture the "bursty" distribution of such words. In this paper, we augment a hierarchical LSTM language model that generates sequences of word tokens character by character with a caching mechanism that learns to reuse previously generated words. To validate our model we construct a new open-vocabulary language modeling corpus (the Multilingual Wikipedia Corpus; MWC) from comparable Wikipedia articles in 7 typologically diverse languages and demonstrate the effectiveness of our model across this range of languages.
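
The abstract describes the architecture only at a high level. Below is a minimal PyTorch sketch of the mechanism it names: a character-level spelling model for new words plus a learned gate over a cache of recently generated words. All class, method, and parameter names here are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch, assuming PyTorch. Names (CharWordLM, copy_gate, cache_size,
# etc.) are illustrative; this is not the paper's released code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CharWordLM(nn.Module):
    """Word-level LSTM that spells each word with a character LSTM and can
    instead copy a recently generated word from a bounded cache."""

    def __init__(self, n_chars, char_dim=64, word_dim=256, cache_size=100):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim)
        # Encoder: reads a word's characters into a fixed-size word vector.
        self.char_enc = nn.LSTM(char_dim, word_dim, batch_first=True)
        # Sentence-level state over word vectors (the "hierarchical" part).
        self.word_rnn = nn.LSTMCell(word_dim, word_dim)
        # Decoder head: next-character logits when spelling a new word.
        self.char_out = nn.Linear(word_dim, n_chars)
        # Gate: probability of reusing a cached word instead of spelling one.
        self.copy_gate = nn.Linear(word_dim, 1)
        self.cache_size = cache_size
        self.cache = []  # (word_vector, word_string) pairs, most recent last

    def encode_word(self, char_ids):
        # char_ids: (1, word_len) tensor of character indices.
        _, (h, _) = self.char_enc(self.char_emb(char_ids))
        return h[-1].squeeze(0)  # (word_dim,) word vector

    def next_word_mixture(self, h_word):
        # h_word: (word_dim,) current sentence-level hidden state.
        p_copy = torch.sigmoid(self.copy_gate(h_word))  # scalar in (0, 1)
        if not self.cache:
            return torch.zeros(()), None
        keys = torch.stack([v for v, _ in self.cache])  # (n_cached, word_dim)
        attn = F.softmax(keys @ h_word, dim=0)          # which word to copy
        # The character decoder covers the remaining 1 - p_copy of the mass.
        return p_copy.squeeze(), attn

    def remember(self, vec, word):
        # Burstiness: a word generated once stays cheap to regenerate.
        self.cache.append((vec.detach(), word))
        self.cache = self.cache[-self.cache_size:]
```

The point of the gate in this sketch is that reuse becomes a first-class decision: the model pays the character-by-character spelling cost only for genuinely new words, while bursty repetitions of a rare word reduce to a single cache lookup.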
