How Is a Data-Driven Approach Better than Random Choice in Label Space Division for Multi-Label Classification?

Piotr Szymański; Tomasz Kajdanowicz; Kristian Kersting

Abstract

We propose using five data-driven community detection approaches from social networks to partition the label space in the task of multi-label classification, as an alternative to the random partitioning into equal subsets performed by RAkELd. We evaluate modularity maximization using fast greedy and leading eigenvector approximations, as well as the infomap, walktrap and label propagation algorithms. For this purpose, we propose to construct a label co-occurrence graph (in both weighted and unweighted versions) based on training data and to perform community detection on it to partition the label set. Each partition then constitutes the label space of a separate multi-label classification sub-problem. As a result, we obtain an ensemble of multi-label classifiers that jointly covers the whole label space. Using the binary relevance and label powerset classification methods as base classifiers, we compare community detection approaches to label space division against random baselines on 12 benchmark datasets over five evaluation measures. We discover that data-driven approaches are more efficient and more likely to outperform RAkELd than binary relevance or label powerset is, in every evaluated measure. For all measures apart from Hamming loss, data-driven approaches are significantly better than RAkELd (α = 0.05), and at least one data-driven approach is more likely to outperform RAkELd than a priori methods in the case of RAkELd's best performance. This is the largest RAkELd evaluation published to date, with 250 samplings per value for 10 values of the RAkELd parameter k on 12 datasets.
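The pipeline described in the abstract (co-occurrence graph, community detection, one sub-problem per community) can be sketched in a few lines of standard-library Python. This is a minimal illustration and not the authors' implementation: the function names, the toy label matrix `Y`, and the deterministic tie-breaking in the label propagation step are all assumptions made here for reproducibility (label propagation normally breaks ties at random). A RAkELd-style random partition is included only as the baseline for contrast.

```python
import random
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(Y, weighted=True):
    """Build a label co-occurrence graph from a binary indicator matrix Y.

    Returns {label: {neighbor: weight}}. The weight counts the training
    samples in which both labels appear (or is 1 in the unweighted variant).
    """
    n_labels = len(Y[0])
    graph = {label: defaultdict(int) for label in range(n_labels)}
    for row in Y:
        present = [j for j, value in enumerate(row) if value]
        for a, b in combinations(present, 2):
            graph[a][b] += 1
            graph[b][a] += 1
    if not weighted:
        graph = {u: {v: 1 for v in nbrs} for u, nbrs in graph.items()}
    return graph


def label_propagation(graph, max_sweeps=100):
    """Asynchronous label propagation (simplified, deterministic tie-breaking).

    Each node adopts the community with the largest total edge weight among
    its neighbors. On a tie it keeps its current community if that is among
    the best, otherwise it takes the smallest community id.
    """
    community = {node: node for node in graph}
    for _ in range(max_sweeps):
        changed = False
        for node in sorted(graph):
            scores = defaultdict(int)
            for neighbor, weight in graph[node].items():
                scores[community[neighbor]] += weight
            if not scores:
                continue  # a label that never co-occurs stays on its own
            best = max(scores.values())
            candidates = sorted(c for c, s in scores.items() if s == best)
            if community[node] not in candidates:
                community[node] = candidates[0]
                changed = True
        if not changed:
            break
    groups = defaultdict(list)
    for node, comm in community.items():
        groups[comm].append(node)
    return sorted(sorted(members) for members in groups.values())


def random_partition(n_labels, k, seed=0):
    """RAkELd-style baseline: shuffle labels, cut into disjoint subsets of size k.

    The last subset is smaller when k does not divide n_labels.
    """
    labels = list(range(n_labels))
    random.Random(seed).shuffle(labels)
    return [sorted(labels[i:i + k]) for i in range(0, n_labels, k)]


def project(Y, subset):
    """Restrict the label matrix to one subset: the targets of one sub-problem."""
    return [[row[j] for j in subset] for row in Y]


# Hypothetical toy data: 6 training samples, 5 labels.
Y = [
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [1, 0, 1, 0, 0],
]

partition = label_propagation(cooccurrence_graph(Y))
print(partition)  # [[0, 1, 2], [3, 4]]

# One target matrix per community; each would be handed to its own
# multi-label classifier (e.g. binary relevance or label powerset).
sub_targets = [project(Y, subset) for subset in partition]
```

In this toy example the labels 0, 1 and 2 co-occur with each other and never with labels 3 and 4, so the data-driven partition recovers those two groups, whereas `random_partition(5, 2)` would split the labels without regard to co-occurrence.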

Bibliographic details

Source: Entropy, 2016, Issue 8
Authors: Piotr Szymański; Tomasz Kajdanowicz; Kristian Kersting
Original format: PDF
Chinese Library Classification: Physiology

