We present a new loss function, namely Wing loss, for robust facial landmark localisation with Convolutional Neural Networks (CNNs). We first compare and analyse different objective functions and show that the L1 and smooth L1 loss functions perform much better than the widely used L2 loss function in facial landmark localisation. The analysis of these loss functions suggests that, for the training of a CNN-based localisation model, more attention should be paid to small and medium range errors. To this end, we design a piece-wise loss function. The new loss function amplifies the impact of errors from the interval (-w, w) by switching from L1 loss to a modified logarithm function. To address the problem of under-representation of samples with large out-of-plane head rotations in the training set, we propose a simple but effective boosting strategy, referred to as Hard Sample Mining (HSM). In particular, we deal with the data imbalance problem by duplicating the minority training samples and perturbing them by injecting random image rotation, bounding box translation and other data augmentation approaches. Last, the proposed approach is extended to create a two-stage localisation framework for robust facial landmark localisation in the wild. The experimental results obtained on the AFLW and 300W datasets demonstrate the merits of the Wing loss function, and prove the superiority of the proposed method over the state-of-the-art approaches.
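
The piece-wise loss described above can be illustrated with a minimal sketch. The abstract does not state the formula, so the code assumes a form in which the loss is w * ln(1 + |x| / epsilon) for |x| < w and |x| - C otherwise, with C chosen so that the two pieces meet at |x| = w. The function name, the epsilon parameter and the default values are assumptions made for illustration only.

```python
import math


def wing_loss(x, w=10.0, epsilon=2.0):
    """Piece-wise loss on one residual x (prediction minus target).

    Assumed form (not given in the abstract):
      |x| <  w : w * ln(1 + |x| / epsilon)   (logarithmic piece)
      |x| >= w : |x| - C                     (L1 piece)
    The defaults for w and epsilon are illustrative assumptions.
    """
    abs_x = abs(x)
    # C joins the two pieces continuously at |x| = w.
    c = w - w * math.log(1.0 + w / epsilon)
    if abs_x < w:
        return w * math.log(1.0 + abs_x / epsilon)
    return abs_x - c


if __name__ == "__main__":
    # Compare against plain L1 for small, medium and large residuals.
    for residual in (0.1, 1.0, 5.0, 10.0, 20.0):
        print(residual, abs(residual), wing_loss(residual))
```

With these illustrative values, the slope of the logarithmic piece at zero is w / epsilon = 5, compared with 1 for L1, which is one way to see how small and medium errors receive more weight while large errors still grow linearly.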