Journal: National Journal of System and Information Technology

COMPARISON OF STRING SIMILARITY ALGORITHMS TO MEASURE LEXICAL SIMILARITY



Abstract

String similarity represents the lexical similarity between two words. This can be further exploited to identify similarity between questions. Several string similarity algorithms exist in the literature. In this paper the authors have implemented five string similarity algorithms, viz. Dice coefficient, Jaccard similarity, Levenshtein distance, Jaro distance and Cosine similarity. The results of these algorithms are then compared with the judgments of human judges to determine which of them most closely resembles the way humans rate the dissimilarity of the given strings. The experiments are conducted over 1000 English word pairs.
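As an illustration of the five measures named in the abstract, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not say how strings are tokenized, so the use of character bigrams for Dice, Jaccard and Cosine is an assumption, as are all function names and the handling of empty or one-character strings.

```python
from collections import Counter
from math import sqrt


def bigrams(s):
    """Character bigrams of s (assumed tokenization; the abstract names none)."""
    return [s[i:i + 2] for i in range(len(s) - 1)]


def dice(a, b):
    """Dice coefficient over bigram sets: 2 * |X & Y| / (|X| + |Y|)."""
    x, y = set(bigrams(a)), set(bigrams(b))
    if not x and not y:
        return 1.0 if a == b else 0.0
    return 2 * len(x & y) / (len(x) + len(y))


def jaccard(a, b):
    """Jaccard similarity over bigram sets: |X & Y| / |X | Y|."""
    x, y = set(bigrams(a)), set(bigrams(b))
    if not x and not y:
        return 1.0 if a == b else 0.0
    return len(x & y) / len(x | y)


def cosine(a, b):
    """Cosine similarity between bigram count vectors."""
    x, y = Counter(bigrams(a)), Counter(bigrams(b))
    dot = sum(x[g] * y[g] for g in x)
    norm = sqrt(sum(v * v for v in x.values())) * sqrt(sum(v * v for v in y.values()))
    if norm == 0:
        return 1.0 if a == b else 0.0
    return dot / norm


def levenshtein(a, b):
    """Edit distance (insert, delete, substitute), computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def jaro(a, b):
    """Jaro similarity in [0, 1]; 1.0 means identical strings."""
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_hit = [False] * len(a)
    b_hit = [False] * len(b)
    m = 0  # number of matching characters
    for i, ca in enumerate(a):
        lo, hi = max(0, i - window), min(len(b), i + window + 1)
        for j in range(lo, hi):
            if not b_hit[j] and b[j] == ca:
                a_hit[i] = b_hit[j] = True
                m += 1
                break
    if m == 0:
        return 0.0
    a_seq = [c for c, hit in zip(a, a_hit) if hit]
    b_seq = [c for c, hit in zip(b, b_hit) if hit]
    t = sum(p != q for p, q in zip(a_seq, b_seq)) / 2  # transpositions
    return (m / len(a) + m / len(b) + (m - t) / m) / 3


if __name__ == "__main__":
    for w1, w2 in [("night", "nacht"), ("kitten", "sitting")]:
        print(w1, w2, dice(w1, w2), jaccard(w1, w2), cosine(w1, w2),
              levenshtein(w1, w2), jaro(w1, w2))
```

Note that Levenshtein returns a distance (lower means more alike) while the other four return similarities (higher means more alike), so any comparison against human ratings would first need the scores placed on a common scale; how the authors did this is not stated in the abstract.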
