Journal: 《计算机应用与软件》 (Computer Applications and Software)

A Study of Strategies for Optimizing Word2Vec Training Results

Abstract

Word2Vec is a language processing toolkit that Google open-sourced in 2013. While training a neural-network language model, it represents each word as a real-valued vector and uses cosine distance in the vector space to find words with high semantic similarity; its training efficiency is relatively high. In applying Word2Vec to train word vectors, we run experiments on two stages that may affect the result, Chinese word segmentation and algorithm selection, and combine these experiments with a close reading of parts of the core source code. The experiments identify the strategy that gives the best training results, improve Word2Vec's performance to some extent, and provide better word vectors for downstream applications.
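
The cosine-based lookup mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's code or the Word2Vec source. The helper names `cosine_similarity` and `most_similar` and the three-dimensional `toy_vectors` are hypothetical and made up for illustration; trained Word2Vec vectors typically have a hundred or more dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Convention chosen for this sketch: a zero vector is similar to nothing.
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(query, vectors, topn=2):
    """Rank every other word by cosine similarity to the query word."""
    q = vectors[query]
    scored = [
        (word, cosine_similarity(q, vec))
        for word, vec in vectors.items()
        if word != query
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]


# Hypothetical toy vectors, invented for this example (not trained output).
toy_vectors = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.85, 0.82, 0.15],
    "apple": [0.1, 0.2, 0.95],
    "banana": [0.12, 0.18, 0.9],
}

if __name__ == "__main__":
    for word, score in most_similar("king", toy_vectors):
        print(word, round(score, 3))
```

Word segmentation and algorithm choice change the vectors themselves, so they change which neighbours this ranking returns. That makes a lookup like this one a simple way to inspect the effect of each training strategy.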
