
Robust Gram Embeddings

Abstract

Word embedding models learn vector word representations that can be used in a variety of NLP applications. When training data is scarce, these models risk losing their generalization ability due to model complexity and overfitting to the finite data. We propose a regularized embedding formulation, called Robust Gram (RG), which penalizes overfitting by suppressing the disparity between target and context embeddings. Our experimental analysis shows that an RG model trained on small datasets generalizes better than the alternatives, is more robust to variations in the training set, and correlates well with human similarity judgments on a set of word similarity tasks.
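
The abstract states only that RG adds a regularizer suppressing the disparity between the target and context embeddings; it does not give the exact objective. The sketch below is one minimal, hypothetical reading: a standard skip-gram negative-sampling (SGNS) loss plus an L2 penalty on the difference between the two embedding tables. The class name, the lambda_rg hyperparameter, and the restriction of the penalty to the rows touched by each batch are illustrative assumptions, not the paper's formulation.

    # Hypothetical sketch of an RG-style objective: SGNS plus an L2
    # disparity penalty between the target and context embedding tables.
    # lambda_rg and all names are illustrative, not from the paper.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class RobustGramSketch(nn.Module):
        def __init__(self, vocab_size: int, dim: int, lambda_rg: float = 0.1):
            super().__init__()
            self.target = nn.Embedding(vocab_size, dim)   # target ("input") vectors
            self.context = nn.Embedding(vocab_size, dim)  # context ("output") vectors
            self.lambda_rg = lambda_rg

        def forward(self, center, pos_ctx, neg_ctx):
            # center: (B,), pos_ctx: (B,), neg_ctx: (B, K) word indices.
            u = self.target(center)        # (B, D)
            v_pos = self.context(pos_ctx)  # (B, D)
            v_neg = self.context(neg_ctx)  # (B, K, D)

            # Standard SGNS terms: attract observed pairs, repel negatives.
            pos_score = (u * v_pos).sum(-1)                            # (B,)
            neg_score = torch.bmm(v_neg, u.unsqueeze(-1)).squeeze(-1)  # (B, K)
            sgns_loss = -(F.logsigmoid(pos_score).mean()
                          + F.logsigmoid(-neg_score).mean())

            # Disparity penalty: pull the two tables together, computed
            # here only over the rows this batch touches.
            ids = torch.cat([center, pos_ctx, neg_ctx.reshape(-1)]).unique()
            disparity = (self.target(ids) - self.context(ids)).pow(2).sum(-1).mean()

            return sgns_loss + self.lambda_rg * disparity

    # Example forward/backward pass with random indices (batch 8, 5 negatives).
    model = RobustGramSketch(vocab_size=1000, dim=50)
    loss = model(torch.randint(0, 1000, (8,)),
                 torch.randint(0, 1000, (8,)),
                 torch.randint(0, 1000, (8, 5)))
    loss.backward()

As lambda_rg grows, the penalty pushes the two tables to coincide, i.e. toward a skip-gram with tied target and context weights; a finite value interpolates between the untied and fully tied models, which is one way to read "suppressing the disparity."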
