International Journal of Internet Manufacturing and Services

A classification algorithm based on weighted ML-kNN for multi-label data

Abstract

The ML-kNN algorithm adapts the traditional kNN algorithm to multi-label classification by combining it with naive Bayesian classification. However, ML-kNN is prone to misjudging, or only partially identifying, the label set of an unseen instance in two special cases: when the labels in the training set are imbalanced, and when the training instances are unevenly distributed in space. This paper therefore proposes a weighted ML-kNN algorithm (wML-kNN). The main idea is to assign a different weight to each label according to the proportion of labels and the mutual information of the spatial distribution of unseen instances relative to training instances. This reduces the probability of misjudging the unseen instance's label set. A comparative study was conducted on four multi-label datasets: a review classification dataset and three published benchmark datasets covering yeast gene function analysis, natural scene classification, and musical sentiment classification. The results show that wML-kNN outperforms the other four multi-label learning algorithms compared, including ML-kNN.
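
As a rough illustration of the idea described in the abstract, the sketch below implements the standard ML-kNN decision rule in plain Python: smoothed label priors are combined with neighbour-count likelihoods, and a label is assigned when its posterior outweighs the alternative. The `label_weights` argument is a hypothetical hook. The abstract does not state the wML-kNN weight formula, so the weights here are supplied by the caller and should not be read as the authors' method. The class name, parameter names, and toy data are likewise assumptions made for illustration.

```python
import math


def _knn_indices(X, query, k, skip=None):
    """Indices of the k training points closest to `query` (Euclidean)."""
    dists = [(math.dist(x, query), i) for i, x in enumerate(X) if i != skip]
    dists.sort()
    return [i for _, i in dists[:k]]


class MLkNN:
    """Minimal ML-kNN sketch; `smooth` is the Laplace smoothing constant."""

    def __init__(self, k=3, smooth=1.0):
        self.k = k
        self.s = smooth

    def fit(self, X, Y):
        self.X, self.Y = X, Y
        m, q, k, s = len(X), len(Y[0]), self.k, self.s
        # Smoothed prior probability that an instance carries label l.
        self.prior = [(s + sum(y[l] for y in Y)) / (2 * s + m) for l in range(q)]
        # pos[l][j]: instances WITH label l whose k neighbours contain
        # exactly j carriers of l; neg[l][j]: same for instances WITHOUT l.
        pos = [[0] * (k + 1) for _ in range(q)]
        neg = [[0] * (k + 1) for _ in range(q)]
        for i in range(m):
            nbrs = _knn_indices(X, X[i], k, skip=i)
            for l in range(q):
                j = sum(Y[n][l] for n in nbrs)
                (pos if Y[i][l] else neg)[l][j] += 1
        # Smoothed likelihoods P(j neighbours carry l | has l / lacks l).
        self.like_pos = [[(s + c) / (s * (k + 1) + sum(r)) for c in r] for r in pos]
        self.like_neg = [[(s + c) / (s * (k + 1) + sum(r)) for c in r] for r in neg]
        return self

    def predict(self, x, label_weights=None):
        """Return a 0/1 list; `label_weights` is a hypothetical per-label
        multiplier on the positive side (all ones gives plain ML-kNN)."""
        q = len(self.prior)
        w = label_weights or [1.0] * q
        nbrs = _knn_indices(self.X, x, self.k)
        out = []
        for l in range(q):
            j = sum(self.Y[n][l] for n in nbrs)
            p1 = w[l] * self.prior[l] * self.like_pos[l][j]
            p0 = (1 - self.prior[l]) * self.like_neg[l][j]
            out.append(1 if p1 > p0 else 0)
        return out


# Toy data (invented for illustration): two well-separated clusters,
# each carrying one of two labels.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
Y = [[1, 0]] * 4 + [[0, 1]] * 4
model = MLkNN(k=3).fit(X, Y)
print(model.predict((0.5, 0.5)))
print(model.predict((0.5, 0.5), label_weights=[1.0, 10.0]))
```

Raising the weight of a label makes the rule more willing to assign it, which is the kind of lever a weighted variant could use to counter label imbalance. How wML-kNN actually derives its weights from label proportions and mutual information is described in the paper, not here.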
