The goal of multi-label classification is to predict multiple labels per data point simultaneously. Real-world applications tend to have high-dimensional label spaces, with hundreds or even thousands of labels. While these labels could be predicted separately, capturing label correlations may yield better predictive performance. In contrast to previous approaches in the literature, which model label correlations globally, this paper proposes a novel algorithm that models correlations and clusters labels locally. LaCovaC is a multi-label decision tree classifier that clusters labels into several dependent subsets at various points during training. The clusters are obtained locally by using the label correlation matrix to identify conditionally dependent labels in localised regions of the feature space. LaCovaC alternates between two main decisions on the label matrix, which has training instances in rows and labels in columns: splitting the matrix vertically by partitioning the labels into subsets, or splitting it horizontally using features in the conventional way. Experiments on 13 benchmark datasets demonstrate that our proposal achieves competitive performance over a wide range of evaluation metrics when compared with state-of-the-art multi-label classifiers.
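
As an illustration only, the choice between a vertical and a horizontal split at a single tree node could be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: it uses the sample covariance as a stand-in for the label correlation matrix, the fixed `threshold` is a hypothetical cut-off that does not come from the abstract, and the names `label_clusters` and `choose_split` are invented for this example.

```python
import numpy as np


def label_clusters(Y, threshold=0.05):
    """Group label columns of Y into clusters of mutually linked labels.

    Assumption: two labels are "linked" when the absolute value of their
    sample covariance exceeds `threshold` (a hypothetical cut-off).
    Clusters are the connected components of the resulting graph.
    Y must contain at least two rows for the covariance to be defined.
    """
    Y = np.asarray(Y, dtype=float)
    n_labels = Y.shape[1]
    # Columns are labels, rows are training instances at this node.
    cov = np.atleast_2d(np.cov(Y, rowvar=False))
    linked = np.abs(cov) > threshold

    unvisited = set(range(n_labels))
    clusters = []
    while unvisited:
        start = min(unvisited)
        unvisited.remove(start)
        stack = [start]
        component = [start]
        # Depth-first search over labels linked to the current component.
        while stack:
            i = stack.pop()
            for j in sorted(unvisited):
                if linked[i, j]:
                    unvisited.remove(j)
                    stack.append(j)
                    component.append(j)
        clusters.append(sorted(component))
    return clusters


def choose_split(Y, threshold=0.05):
    """Return ("vertical", clusters) if the labels fall into several
    independent clusters at this node, else ("horizontal", clusters)."""
    clusters = label_clusters(Y, threshold)
    if len(clusters) > 1:
        return "vertical", clusters
    return "horizontal", clusters


# Labels 0 and 1 always co-occur; label 2 varies independently of them.
Y_node = np.array([[1, 1, 0],
                   [1, 1, 1],
                   [0, 0, 0],
                   [0, 0, 1]])
decision, clusters = choose_split(Y_node)
```

In this toy node the first two labels form one cluster and the third stands alone, so the sketch opts for a vertical split. When all labels are linked, it falls back to a conventional feature-based (horizontal) split, which is left out here.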