Multi-label thresholding for cost-sensitive classification

Alotaibi Reem; Flach Peter

Abstract

Multi-label classification associates each instance with a set of labels which reflects the nature of a wide range of real-world applications. However, existing approaches assume that all labels have the same misclassification cost, whereas in real-world problems different types of misclassification errors have different costs, which are generally unknown in the training context or might change from one context to another. Thus, there is a demand for cost-sensitive classification methods that minimise the average misclassification cost rather than error rates or counts. In this paper, we adopt a simple yet general method, called thresholding, which applies to most classification algorithms to adapt them to cost-sensitive multi-label classification. This paper investigates current threshold choice approaches for multi-label classification. It explores the choice of single and multiple thresholds and extends some of the current techniques to support multi-label problems. Moreover, it proposes cost curves and scatter diagrams for performance evaluation in the multi-label setting. Experimental evaluation on 13 multi-label datasets demonstrates that there is no significant loss by adjusting a global threshold rather than a per-label threshold considering different misclassification costs across labels. Although tuning multiple thresholds is the obvious solution, the global threshold can also be valid. (c) 2020 Elsevier B.V. All rights reserved.
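The contrast the abstract draws between one global threshold and one threshold per label can be illustrated with a minimal sketch. It is not the authors' procedure: the grid search, the per-label false-positive and false-negative costs (`cost_fp`, `cost_fn`), the synthetic scores, and all function names are assumptions made for illustration only.

```python
import numpy as np

def average_cost(y_true, scores, thresholds, cost_fp, cost_fn):
    """Mean misclassification cost over all instance-label pairs.

    `thresholds` may be a scalar (global) or an array with one entry per label.
    """
    y_pred = scores >= thresholds                 # broadcasts over the label axis
    fp = (y_pred & (y_true == 0)) * cost_fp       # predicted relevant, actually irrelevant
    fn = (~y_pred & (y_true == 1)) * cost_fn      # predicted irrelevant, actually relevant
    return float((fp + fn).mean())

def tune_global(y_true, scores, cost_fp, cost_fn, grid):
    """Pick the single threshold on `grid` with the lowest average cost."""
    costs = [average_cost(y_true, scores, t, cost_fp, cost_fn) for t in grid]
    return float(grid[int(np.argmin(costs))])

def tune_per_label(y_true, scores, cost_fp, cost_fn, grid):
    """Pick, independently for each label, the threshold with the lowest cost."""
    n_labels = y_true.shape[1]
    best = np.empty(n_labels)
    for j in range(n_labels):
        costs = [
            average_cost(y_true[:, [j]], scores[:, [j]], t, cost_fp[j], cost_fn[j])
            for t in grid
        ]
        best[j] = grid[int(np.argmin(costs))]
    return best

# Synthetic data (assumption): 200 instances, 3 labels, noisy scores in [0, 1].
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=(200, 3))
scores = np.clip(0.6 * y_true + rng.normal(0.2, 0.25, size=y_true.shape), 0.0, 1.0)

# Hypothetical costs that differ across labels.
cost_fp = np.array([1.0, 1.0, 5.0])
cost_fn = np.array([1.0, 4.0, 1.0])
grid = np.linspace(0.0, 1.0, 101)

t_global = tune_global(y_true, scores, cost_fp, cost_fn, grid)
t_labels = tune_per_label(y_true, scores, cost_fp, cost_fn, grid)

print("global threshold:", t_global,
      "cost:", average_cost(y_true, scores, t_global, cost_fp, cost_fn))
print("per-label thresholds:", t_labels,
      "cost:", average_cost(y_true, scores, t_labels, cost_fp, cost_fn))
```

On the data used for tuning, the per-label thresholds can never cost more than the global one, because the global threshold is one of the candidates for every label. Whether that advantage survives on held-out data, or under costs that differ from those used for tuning, is the question the paper's experiments address; this sketch does not answer it.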

Bibliographic record

Source
Neurocomputing, 2021, Issue 14, pp. 232-247 (16 pages)
Authors
Alotaibi Reem; Flach Peter
Author affiliations

King Abdulaziz Univ, Fac Comp & Informat Technol, Jeddah, Saudi Arabia

Univ Bristol, Intelligent Syst Lab, Bristol, Avon, England

Indexed in
Science Citation Index (SCI); Engineering Index (EI)
Original format
PDF
Language
English
Keywords
Multi-label classification; Cost-sensitive learning; Threshold choice methods; Global threshold; Context; Misclassification costs

Similar literature

1. Dynamic principal projection for cost-sensitive online multi-label classification [J]. Chu Hong-Min, Huang Kuan-Hao, Lin Hsuan-Tien. Machine Learning, 2019, Issue 8-9.
2. Advances in Cost-sensitive Multiclass and Multi-label Classification [J]. Hsuan-Tien Lin. SIGKDD Explorations, 2019.
3. Cost-sensitive classifier chains: Selecting low-cost features in multi-label classification [J]. Teisseyre Pawel, Zufferey Damien, Slomka Marta. Pattern Recognition: The Journal of the Pattern Recognition Society, 2019.
4. Multi-label Classification with Feature-aware Cost-sensitive Label Embedding [C]. Hsien-Chun Chiu, Hsuan-Tien Lin. Conference on Technologies and Applications of Artificial Intelligence, 2018.
5. Adversarial Approach to Cost-Sensitive Classification and Sequence Tagging [D]. Asif, Kaiser Newaj. 2019.
6. Fine-Grained Emotion Detection in Suicide Notes: A Thresholding Approach to Multi-Label Classification [O]. Kim Luyckx, Frederik Vaassen, Claudia Peersman. 2012.
7. Dynamic principal projection for cost-sensitive online multi-label classification [O]. Hong-Min Chu, Kuan-Hao Huang, Hsuan-Tien Lin. 2019.
