International Joint Conference on Neural Networks

Estimator Vectors: OOV Word Embeddings based on Subword and Context Clue Estimates

Abstract

Neural network models such as word2vec have successfully extracted semantic representations of words from unlabeled corpora. These representations are generally high quality and computationally inexpensive to train, which has made them popular. However, these approaches generally fail to approximate out-of-vocabulary (OOV) words, a task humans perform quite easily using word roots and context clues. This paper proposes a neural network model that jointly learns high-quality word representations, subword representations, and context clue representations. Learning all three types of representations together enhances the learning of each, yielding enriched word vectors as well as strong estimates for OOV words, obtained by combining the corresponding context clue and subword embeddings. Our model, called Estimator Vectors (EV), learns strong word embeddings and is competitive with state-of-the-art methods for OOV estimation.
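
The abstract does not say how the subword and context clue embeddings are combined into an OOV estimate. As a rough illustration only, the sketch below assumes plain averaging: the character n-grams of the unknown word and the words around it are looked up in two embedding tables, each group is averaged, and the two means are averaged again. The function names (`char_ngrams`, `mean_vector`, `estimate_oov`), the n-gram range, the boundary markers, and the toy embedding tables are all hypothetical and are not taken from the paper.

```python
def char_ngrams(word, n_min=3, n_max=5):
    """Return character n-grams of a word padded with '<' and '>' markers."""
    padded = f"<{word}>"
    return [padded[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(padded) - n + 1)]


def mean_vector(vectors, dim):
    """Element-wise mean of a list of vectors; zero vector if the list is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[k] for v in vectors) / len(vectors) for k in range(dim)]


def estimate_oov(word, context_words, subword_emb, context_emb, dim):
    """Assumed combination rule: average the subword mean and the context mean.

    This is an illustrative stand-in, not the combination used by the
    Estimator Vectors model.
    """
    sub = [subword_emb[g] for g in char_ngrams(word) if g in subword_emb]
    ctx = [context_emb[w] for w in context_words if w in context_emb]
    # Only groups that actually produced vectors contribute to the estimate.
    parts = [mean_vector(group, dim) for group in (sub, ctx) if group]
    return mean_vector(parts, dim)


# Toy, hand-written embedding tables (hypothetical values).
subword_emb = {"<un": [1.0, 0.0], "unh": [0.0, 1.0]}
context_emb = {"very": [2.0, 2.0]}

vec = estimate_oov("unhappy", ["very", "sad"], subword_emb, context_emb, dim=2)
print(vec)  # [1.25, 1.25]
```

In this toy run the subword mean is `[0.5, 0.5]` and the context mean is `[2.0, 2.0]`, so the estimate is their midpoint. A trained model would learn both tables jointly with the word embeddings, as the abstract describes, rather than use fixed values.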