Conference on Empirical Methods in Natural Language Processing
Predicting and interpreting embeddings for out of vocabulary words in downstream tasks


Abstract

We propose a novel way to handle out-of-vocabulary (OOV) words in downstream natural language processing (NLP) tasks. We implement a network that predicts useful embeddings for OOV words based on their morphology and on the context in which they appear. Our model also incorporates an attention mechanism indicating how much focus is allocated to the left context words, the right context words, or the word's characters, hence making the prediction more interpretable. The model is a "drop-in" module that is jointly trained with the downstream task's neural network, thus producing embeddings specialized for the task at hand. When the task is mostly syntactic, we observe that our model directs most of its attention to surface-form characters. For more semantic tasks, on the other hand, the network allocates more attention to the surrounding words. In all our tests, the module helps the network achieve better performance than the use of simple random embeddings.
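The core idea of the abstract — attending over three information sources (left context, right context, characters) and combining them into one predicted embedding — can be sketched as follows. This is a hypothetical simplification: the function name, dimensions, and the use of mean-pooling in place of the paper's jointly trained encoders are all assumptions, not the authors' implementation.

```python
import numpy as np

def predict_oov_embedding(left_ctx, right_ctx, char_vecs, attn_w):
    """Hypothetical sketch of attention over three sources.

    left_ctx:  (n_left, d)  vectors for words left of the OOV word
    right_ctx: (n_right, d) vectors for words right of the OOV word
    char_vecs: (n_chars, d) vectors for the OOV word's characters
    attn_w:    (d,)         assumed learned attention parameter

    Each encoder is simplified to a mean over its inputs; the paper
    trains these encoders jointly with the downstream task.
    """
    # One summary vector per source: left, right, characters.
    sources = np.stack([left_ctx.mean(0), right_ctx.mean(0), char_vecs.mean(0)])
    # Softmax attention over the three sources; these weights are
    # what makes the prediction interpretable (syntactic tasks should
    # weight characters, semantic tasks the context).
    scores = sources @ attn_w
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Predicted OOV embedding: attention-weighted mix of the sources.
    return weights @ sources, weights
```

Inspecting the returned `weights` per prediction is what supports the interpretability claim: a high character weight signals reliance on surface form, a high context weight signals reliance on the surrounding words.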
