
An Improved Chinese Word Semantic Similarity Algorithm Based on CiLin


Abstract

CiLin is a well-known semantic dictionary of Chinese synonyms whose structure and function closely resemble those of WordNet for English. This paper improves the existing CiLin-based algorithm for Chinese word semantic similarity by integrating the word distance, the density of the lowest common parent node, and the branch layer spacing. First, an initial value of the word semantic similarity is calculated from the word distance. Then an adjusting parameter, which depends on the density n of the lowest common parent node and the branch interval k, revises the initial value downward. Because the parameter takes the fourth root of an expression in k and n, the revision of the initial similarity is limited to less than 16%, which avoids the unreasonable case in which word pairs at a small distance receive a low similarity because of a large branch interval. On the Miller & Charles word-pair set, the method achieves a Pearson correlation coefficient of 0.8464 against human judgments.
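
The evaluation described in the abstract compares algorithm scores with human ratings through the Pearson correlation coefficient. The following is a minimal sketch of that computation only, not the authors' code. The function name `pearson` and the sample scores are illustrative assumptions; the values are made up and are not taken from the Miller & Charles set or from the paper.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Return the Pearson correlation coefficient of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: human ratings on a 0-4 scale and algorithm
# similarities in [0, 1] for five word pairs (invented for illustration).
human_scores = [3.9, 3.5, 2.8, 1.1, 0.4]
algorithm_scores = [0.95, 0.90, 0.60, 0.30, 0.10]

r = pearson(human_scores, algorithm_scores)
print(f"Pearson r = {r:.4f}")
```

A value close to 1 means the algorithm ranks and spaces word pairs much as the human raters do; the two scales do not need to match, because the coefficient is invariant to positive linear rescaling.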

Bibliographic record

  • Source
    Journal of Information and Computational Science | 2015, Issue 10 | pp. 3799-3807 | 9 pages
  • Author affiliations

    Guangxi Key Lab of Multi-source Information Mining & Security and College of Computer Science & Information Technology, Guangxi Normal University, Guilin 541004, China;


  • Indexing information
  • Original format: PDF
  • Language: English
  • CLC classification
  • Keywords

    CiLin; Semantic Similarity; Semantic Distance; Chinese Information Processing;

