Annual Conference on Neural Information Processing Systems

Learning word embeddings efficiently with noise-contrastive estimation



Abstract

Continuous-valued word embeddings learned by neural language models have recently been shown to capture semantic and syntactic information about words very well, setting performance records on several word similarity tasks. The best results are obtained by learning high-dimensional embeddings from very large quantities of data, which makes scalability of the training method a critical factor. We propose a simple and scalable new approach to learning word embeddings based on training log-bilinear models with noise-contrastive estimation. Our approach is simpler, faster, and produces better results than the current state-of-the-art method. We achieve results comparable to the best ones reported, which were obtained on a cluster, using four times less data and more than an order of magnitude less computing time. We also investigate several model types and find that the embeddings learned by the simpler models perform at least as well as those learned by the more complex ones.
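The abstract describes training a log-bilinear model with noise-contrastive estimation (NCE): the observed word in a context is treated as a positive example, k words drawn from a noise distribution as negatives, and a logistic regression distinguishes the two using the difference between the model score and log(k·Pₙ(w)). A minimal sketch of one such update, with toy sizes, a uniform noise distribution, and illustrative variable names (none of this is the authors' actual code):

```python
import numpy as np

rng = np.random.default_rng(0)
V, D, K = 50, 16, 5  # toy vocab size, embedding dim, noise samples per datum

# Log-bilinear model: score of word w in a context is the dot product of
# the context representation with w's target embedding.
Q = 0.1 * rng.standard_normal((V, D))  # target embeddings
R = 0.1 * rng.standard_normal((V, D))  # context embeddings
unigram = np.full(V, 1.0 / V)          # noise distribution P_n (uniform here)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def nce_step(context_ids, target_id, lr=0.5):
    """One NCE update: the observed word is a positive example, K draws
    from P_n are negatives; classify via score - log(K * P_n(w))."""
    ctx = R[context_ids].mean(axis=0)               # context representation
    noise_ids = rng.choice(V, size=K, p=unigram)
    words = np.concatenate(([target_id], noise_ids))
    labels = np.concatenate(([1.0], np.zeros(K)))   # 1 = data, 0 = noise
    scores = Q[words] @ ctx - np.log(K * unigram[words])
    p = sigmoid(scores)                             # P(data | word, context)
    grad = labels - p                               # gradient of NCE objective
    # SGD updates (np.add.at accumulates correctly if a noise id repeats)
    np.add.at(Q, words, lr * grad[:, None] * ctx)
    dctx = (grad[:, None] * Q[words]).sum(axis=0)
    R[context_ids] += lr * dctx / len(context_ids)
    # mean NCE log-likelihood term (higher is better)
    return np.mean(labels * np.log(p + 1e-12)
                   + (1 - labels) * np.log(1 - p + 1e-12))

# Toy run: repeating one (context, target) pair should raise the objective
# as the model learns to separate it from noise samples.
before = nce_step([1, 2], 3)
for _ in range(200):
    after = nce_step([1, 2], 3)
```

The key scalability point from the abstract shows up here: each update touches only K+1 rows of the embedding tables, so the cost per example is independent of the vocabulary size, unlike a full softmax.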


