首页> 中文期刊>计算机科学与探索 >融合词向量的多特征句子相似度计算方法研究

融合词向量的多特征句子相似度计算方法研究

     

摘要

Based on the summarization of sentence similarity computing methods,this paper applies 34 000 pieces of texts of People's Daily to train word vector space model for semantic similarity computing.Then,based on the trained word vector model,this paper designs a multi-feature sentence similarity computing method,which takes both word and sentence structure features into consideration.Firstly,the method takes note of possible effects of the number of overlapping words and word continuity,and then applies word vector model to calculate the semantic similarity of nonoverlapping words.Regarding the aspect of sentence structure,the method takes both overlapping word order and sentence length conformity into consideration.Finally,this paper designs and implements four different sentence similarity calculating methods,and further develops an experimental system.The experimental results show that the method proposed in this paper can get satisfactory results and the combination and optimization upon the features of words and sentence structures can improve the accuracy of sentence similarity calculating.%在归纳常见的句子相似度计算方法后,基于《人民日报》3.4万余份文本训练了用于语义相似度计算的词向量模型,并设计了一种融合词向量的多特征句子相似度计算方法.该方法在词方面,考虑了句子中重叠的词数和词的连续性,并运用词向量模型测量了非重叠词间的相似性;在结构方面,考虑了句子中重叠词的语序和两个句子的长度一致性.实验部分设计实现了4种句子相似度计算方法,并开发了相应的实验系统.结果表明:提出的算法能够取得相对较好的实验结果,对句子中词的语义特征和句子结构特征进行组合处理和优化,能够提升句子相似度计算的准确性.

著录项

相似文献

  • 中文文献
  • 外文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号