International Workshop on Big Data Management and Service

On the Impact of the Length of Subword Vectors on Word Embeddings

Abstract

This paper hypothesizes that better word embeddings can be learned by representing words and subwords with vectors of different lengths. To investigate the impact of the length of subword vectors on word embeddings, this paper proposes a model based on the Subword Information Skip-gram model. Experiments on two datasets across two tasks show that the proposed model outperforms six baselines, confirming the hypothesis. In addition, we observe that, within a specific range, a higher dimensionality of subword vectors consistently improves the quality of word embeddings.
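The Subword Information Skip-gram model builds a word's vector from its character n-grams. The sketch below illustrates, under stated assumptions, how subword vectors of one length can feed word vectors of another: n-gram vectors of dimension d_sub are summed and linearly projected to the word dimension d_word. The projection step, the class name `SubwordEmbedder`, and all dimensions are hypothetical illustrations, not the paper's exact architecture.

```python
import numpy as np

def char_ngrams(word, n_min=3, n_max=6):
    """Extract character n-grams with boundary markers '<' and '>',
    in the style of Subword Information Skip-gram (fastText)."""
    w = f"<{word}>"
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(w) - n + 1):
            grams.append(w[i:i + n])
    return grams

class SubwordEmbedder:
    """Toy sketch (hypothetical, untrained): subword vectors of
    dimension d_sub are summed, then projected to d_word, so the
    subword and word vector lengths can differ."""
    def __init__(self, d_sub=30, d_word=100, seed=0):
        self.rng = np.random.default_rng(seed)
        self.d_sub, self.d_word = d_sub, d_word
        # Linear map from subword space to word space.
        self.proj = self.rng.normal(scale=0.1, size=(d_sub, d_word))
        self.table = {}  # lazily initialized n-gram vectors

    def _vec(self, gram):
        if gram not in self.table:
            self.table[gram] = self.rng.normal(scale=0.1, size=self.d_sub)
        return self.table[gram]

    def embed(self, word):
        # Sum the n-gram vectors, then project to the word dimension.
        s = np.sum([self._vec(g) for g in char_ngrams(word)], axis=0)
        return s @ self.proj  # shape: (d_word,)
```

For example, `char_ngrams("where")` yields n-grams such as `<wh`, `whe`, and `here`, and `embed("where")` returns a vector of length `d_word` regardless of `d_sub`, which is what lets the subword dimensionality vary independently in experiments like those described above.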
