
Exploiting Instance Relationship for Effective Extreme Multi-label Learning


Abstract

Extreme multi-label classification is an important data mining technique that labels each unseen instance with a subset of labels drawn from a large label set. It has wide applications, and many methods have been proposed in recent years. Existing methods either compress the label space or train a classifier on instances' features; among these, tree-based classifiers offer better efficiency and accuracy. In many real-world applications, instances are not independent, and the relationships between them carry useful information. However, how to use these relationships in extreme multi-label classification has received little study. Exploiting them may improve prediction accuracy, especially when the feature space is very sparse. In this paper, we study how to use the similarity between instances to build more accurate tree-based extreme multi-label classifiers. To this end, we incorporate instance relationships into state-of-the-art models in two ways: feature engineering and collaborative labeling. Extensive experiments on three real-world datasets demonstrate that our proposed method achieves higher accuracy than the state-of-the-art models.
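The abstract names the two mechanisms but does not specify them, so the following is only an illustrative sketch of what they could look like, not the authors' method. It assumes a symmetric 0/1 instance adjacency matrix `A`, a feature matrix `X`, and a matrix `S` of per-label scores from some base classifier. The function names `augment_with_neighbors` and `collaborative_scores` and the weights `alpha` and `beta` are hypothetical.

```python
import numpy as np


def _row_degrees(A):
    """Number of neighbors per instance, with 0 replaced by 1 to avoid division by zero."""
    deg = np.asarray(A, dtype=float).sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0
    return deg


def augment_with_neighbors(X, A, alpha=1.0):
    """Feature engineering (assumed form): append the mean feature vector
    of each instance's neighbors, scaled by alpha, to its own features."""
    X = np.asarray(X, dtype=float)
    A = np.asarray(A, dtype=float)
    neighbor_mean = (A @ X) / _row_degrees(A)
    return np.hstack([X, alpha * neighbor_mean])


def collaborative_scores(S, A, beta=0.5):
    """Collaborative labeling (assumed form): blend each instance's label
    scores with the mean label scores of its neighbors."""
    S = np.asarray(S, dtype=float)
    A = np.asarray(A, dtype=float)
    neighbor_mean = (A @ S) / _row_degrees(A)
    return (1.0 - beta) * S + beta * neighbor_mean
```

In this sketch, an instance with an all-zero feature row still receives a non-trivial representation and non-zero label scores from its neighbors, which is the sparse-feature situation the abstract highlights. A tree-based classifier would then be trained on the augmented features; that step is omitted here.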
