Published in: 广西大学学报(自然科学版) (Journal of Guangxi University, Natural Science Edition)

Word Semantic Similarity Computation Fusing Path and Information Content

     

Abstract

Computing the semantic similarity of words is a foundation of natural language processing research. To address the uneven-density problem common to path-based methods, this paper proposes a method that fuses path distance with information content: a single smoothing parameter combines the two to adjust the semantic distance between concepts, so that the similarity values produced by the path-based method become more reasonable. The method has few parameters, which avoids the overfitting that other methods incur by introducing too many, and it generalizes well. Experiments show that the Pearson correlation coefficient between the similarity values computed by the method and the human judgments in an international standard test set reaches 0.8523, indicating good performance. Analysis of the results further shows that they are affected very little by the algorithm's parameters, which indicates that the proposed algorithm is robust.
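The abstract does not state the fusion formula, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes a hypothetical toy taxonomy (`PARENT`), made-up corpus counts (`COUNT`), and one possible way to blend the two signals: a linear mix, controlled by an assumed smoothing parameter `alpha`, of the edge-count path length and an information-content distance of the form IC(a) + IC(b) - 2·IC(lcs).

```python
import math

# Hypothetical toy taxonomy: child -> parent (the root maps to None).
PARENT = {"entity": None, "animal": "entity", "artifact": "entity",
          "dog": "animal", "cat": "animal", "car": "artifact"}
# Made-up corpus counts for each concept on its own (illustration only).
COUNT = {"entity": 0, "animal": 2, "artifact": 2, "dog": 8, "cat": 6, "car": 2}


def ancestors(c):
    """Return the chain from c up to the root, starting with c itself."""
    chain = []
    while c is not None:
        chain.append(c)
        c = PARENT[c]
    return chain


def subsumed_count(c):
    """Count of c plus the counts of all its descendants."""
    return sum(n for k, n in COUNT.items() if c in ancestors(k))


TOTAL = subsumed_count("entity")


def ic(c):
    """Information content: negative log of the concept's probability."""
    return -math.log(subsumed_count(c) / TOTAL)


def lcs(a, b):
    """Lowest common subsumer: the deepest shared ancestor of a and b."""
    anc_a = ancestors(a)
    return next(c for c in ancestors(b) if c in anc_a)


def path_length(a, b):
    """Number of edges on the shortest path from a to b through the LCS."""
    s = lcs(a, b)
    return ancestors(a).index(s) + ancestors(b).index(s)


def fused_similarity(a, b, alpha=0.5):
    """Assumed fusion: linear blend of path length and IC distance."""
    ic_dist = ic(a) + ic(b) - 2 * ic(lcs(a, b))
    dist = (1 - alpha) * path_length(a, b) + alpha * ic_dist
    return 1.0 / (1.0 + dist)
```

With `alpha=0` the sketch reduces to a pure path-based measure, and with `alpha=1` to a pure information-content measure; intermediate values let the information content stretch or shrink edges in regions where the taxonomy is unevenly dense.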
