Journal: Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology

A new text-based w-distance metric to find the perfect match between words


Abstract

The k-NN algorithm is an instance-based learning algorithm that is widely used in data mining applications. Its core engine is the distance/similarity function, and its performance varies with the choice of that function. The traditional distance/similarity functions used in k-NN do not handle mixed-mode words well, that is, cases where one string contains multiple substrings or words. Examples are the two-word string "Employee Name", the one-word string "Name", and a string of more than two words such as "Name of Employee". Different distance/similarity functions face this ambiguity, which makes it difficult to find the perfect match between words. To improve the perfect-match calculation in the traditional k-NN algorithm, a new similarity distance metric is developed and named word-distance (w-distance). A perfect match helps to identify the exact required value. The proposed w-distance is a hybrid of distance and similarity because it handles the dissimilarity and similarity features of strings at the same time. Simulation results showed that the k-NN algorithm performs better with w-distance than with the Euclidean distance or the cosine similarity.
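
To make the ambiguity described in the abstract concrete, the following is a minimal sketch of the two baseline measures it names, Euclidean distance and cosine similarity, applied to the three example strings. It does not implement w-distance, because the abstract gives no definition of it. The whitespace tokenization, lowercasing, and bag-of-words counts are assumptions made for illustration, and the function names are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    # Assumption: lowercase, whitespace-split word counts stand in for
    # whatever string representation the paper actually uses.
    return Counter(text.lower().split())


def euclidean_distance(a, b):
    # Euclidean distance between the word-count vectors of two strings.
    va, vb = bag_of_words(a), bag_of_words(b)
    vocabulary = va.keys() | vb.keys()
    return math.sqrt(sum((va[w] - vb[w]) ** 2 for w in vocabulary))


def cosine_similarity(a, b):
    # Cosine similarity between the word-count vectors of two strings.
    va, vb = bag_of_words(a), bag_of_words(b)
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # empty string: define similarity as zero
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    query = "Employee Name"
    for candidate in ["Name", "Name of Employee", "Employee Name"]:
        print(
            candidate,
            round(euclidean_distance(query, candidate), 4),
            round(cosine_similarity(query, candidate), 4),
        )
```

Under these assumptions, the Euclidean distance from "Employee Name" to "Name" and to "Name of Employee" is the same (1.0), so a nearest-neighbour search cannot separate the two candidates. Cosine similarity ranks "Name of Employee" higher (about 0.82 against about 0.71) but still assigns the one-word string a substantial score.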
