Many real-world applications involve learning in the presence of multiple labels: a single image, for example, may be labeled sky, cloud, and flower all at once. To complicate matters, the training data may have missing labels, so the challenge is to learn a multi-label predictor even when annotations are incomplete. Naively exploiting such weakly labeled data can degrade performance, so a method that does not hurt learning is desirable. This paper presents SafeML, which addresses the issue with two algorithms: one targets the F_1 score, which trades off precision and recall, and the other targets top-k precision. Both are formulated as zero-sum games restricted to an active set of constraints, and both iteratively improve the predictor's label matrix via linear programming, which makes them efficient.
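The two evaluation criteria mentioned above can be illustrated with a small numeric sketch. This is not code from the paper; the label matrices, scores, and function names here are hypothetical, chosen only to show how micro-averaged F_1 and top-k precision are computed for multi-label predictions:

```python
import numpy as np

# Hypothetical tiny example: 3 items, 4 labels.
# Y_true[i, j] = 1 if item i truly carries label j.
Y_true = np.array([[1, 0, 1, 0],
                   [0, 1, 0, 0],
                   [1, 1, 0, 1]])
# Binary predictions from some multi-label classifier.
Y_pred = np.array([[1, 0, 0, 0],
                   [0, 1, 1, 0],
                   [1, 0, 0, 1]])
# Real-valued label scores, used for the top-k criterion.
scores = np.array([[0.9, 0.1, 0.4, 0.2],
                   [0.2, 0.8, 0.7, 0.1],
                   [0.6, 0.3, 0.1, 0.7]])

def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP/FP/FN over all item-label pairs."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def top_k_precision(s, y_true, k):
    """Fraction of each item's k highest-scoring labels that are correct."""
    topk = np.argsort(-s, axis=1)[:, :k]          # indices of top-k labels
    hits = np.take_along_axis(y_true, topk, axis=1)
    return hits.mean()

print(micro_f1(Y_true, Y_pred))        # 8/11 ≈ 0.727 for this toy data
print(top_k_precision(scores, Y_true, k=2))  # 5/6 ≈ 0.833 for this toy data
```

The trade-off is visible even in this toy case: F_1 penalizes both the false positive (item 1, label 2) and the missed labels, while top-k precision only asks whether the k most confident labels per item are correct.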