Conference on Empirical Methods in Natural Language Processing
Predicting and interpreting embeddings for out of vocabulary words in downstream tasks


Abstract

We propose a novel way to handle out-of-vocabulary (OOV) words in downstream natural language processing (NLP) tasks. We implement a network that predicts useful embeddings for OOV words based on their morphology and on the context in which they appear. Our model also incorporates an attention mechanism indicating how much focus is allocated to the left context words, the right context words, or the word's characters, hence making the prediction more interpretable. The model is a "drop-in" module that is jointly trained with the downstream task's neural network, thus producing embeddings specialized for the task at hand. When the task is mostly syntactic, we observe that our model directs most of its attention to surface-form characters. For more semantic tasks, on the other hand, the network allocates more attention to the surrounding words. In all our tests, the module helps the network achieve better performance than the use of simple random embeddings.
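The core idea of the abstract — attending over three information sources (left context, right context, characters) and combining them into one predicted embedding — can be sketched as follows. This is a hypothetical simplification: the function name, dimensions, and the use of mean-pooling in place of the paper's jointly trained encoders are all assumptions, not the authors' implementation.

```python
import numpy as np

def predict_oov_embedding(left_ctx, right_ctx, char_vecs, attn_w):
    """Hypothetical sketch of attention over three sources.

    left_ctx:  (n_left, d)  vectors for words left of the OOV word
    right_ctx: (n_right, d) vectors for words right of the OOV word
    char_vecs: (n_chars, d) vectors for the OOV word's characters
    attn_w:    (d,)         assumed learned attention parameter

    Each encoder is simplified to a mean over its inputs; the paper
    trains these encoders jointly with the downstream task.
    """
    # One summary vector per source: left, right, characters.
    sources = np.stack([left_ctx.mean(0), right_ctx.mean(0), char_vecs.mean(0)])
    # Softmax attention over the three sources; these weights are
    # what makes the prediction interpretable (syntactic tasks should
    # weight characters, semantic tasks the context).
    scores = sources @ attn_w
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Predicted OOV embedding: attention-weighted mix of the sources.
    return weights @ sources, weights
```

Inspecting the returned `weights` per prediction is what supports the interpretability claim: a high character weight signals reliance on surface form, a high context weight signals reliance on the surrounding words.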
