Research in Fine-Grained Visual Classification has focused on tackling the variations in pose, lighting, and viewpoint using sophisticated localization and segmentation techniques, and on the use of robust texture features to improve performance. In this work, we look at the fundamental optimization of neural network training for fine-grained classification tasks with minimal inter-class variance, and attempt to learn features with increased generalization to prevent overfitting. We introduce Training-with-Confusion, an optimization procedure for fine-grained classification tasks that regularizes training by introducing confusion in activations. Our method can be generalized to any fine-tuning task; it is robust to the presence of small training sets and label noise; and it adds no overhead at prediction time. We find that Training-with-Confusion improves the state of the art on all major fine-grained classification datasets.
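The abstract does not specify the form of the regularizer, so the following is only an illustrative sketch of one way to "introduce confusion in activations": penalize the distance between the predicted class distributions of random sample pairs in a batch, which pulls different inputs' predictions toward one another when the combined loss is minimized. The function names, the choice of squared Euclidean distance, and the weight `lam` are assumptions for illustration, not the paper's definition.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the class axis.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def confusion_penalty(logits, lam=0.1):
    """Hypothetical pairwise-confusion regularizer.

    Splits the batch into pairs and returns the mean squared Euclidean
    distance between each pair's predicted class distributions, scaled
    by lam. Adding this term to the classification loss nudges the
    network's output distributions for different samples closer
    together, acting as a regularizer against overfitting.
    """
    probs = softmax(np.asarray(logits, dtype=float))
    half = len(probs) // 2
    p1, p2 = probs[:half], probs[half:2 * half]
    return lam * np.mean(np.sum((p1 - p2) ** 2, axis=1))
```

In training, this penalty would be added to the usual cross-entropy objective; since it is only a training-time term, it is consistent with the claim of no prediction-time overhead.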