In this paper, we address the problem of person re-identification, which refers to associating images of the same person captured by different cameras. We propose a simple yet effective human part-aligned representation for handling the body part misalignment problem. Our approach decomposes the human body into regions (parts) that are discriminative for person matching, computes a representation over each region, and aggregates the similarities computed between the corresponding regions of a pair of probe and gallery images into the overall matching score. Our formulation, inspired by attention models, is a deep neural network that models the three steps together and is learnt by minimizing the triplet loss function, without requiring body part labeling information. Unlike most existing deep learning algorithms, which learn a global representation or a local representation based on a spatial partition, our approach performs human body partition, and is thus more robust to pose changes and to varying spatial distributions of the person within the bounding box. Our approach shows state-of-the-art results on the standard datasets Market-1501, CUHK03, CUHK01 and VIPeR.
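The three steps described above (part decomposition, per-part representation, similarity aggregation) and the triplet objective can be illustrated with a minimal pure-Python sketch. It is not the network from the paper: the part masks are passed in as fixed inputs, whereas the abstract states that they are learnt jointly without part labels. The function names `part_features`, `matching_score` and `triplet_loss`, the mask-weighted sum pooling, the dot-product similarity, and the margin value of 0.2 are all assumptions made for illustration.

```python
import math


def part_features(feature_map, part_masks):
    """Pool one L2-normalized vector per part.

    feature_map: list of N location vectors, each of length C (hypothetical layout).
    part_masks:  list of K masks, each a list of N weights in [0, 1].
    In the paper the masks come from a learnt attention-style module;
    here they are simply given (assumption).
    """
    dim = len(feature_map[0])
    parts = []
    for mask in part_masks:
        pooled = [0.0] * dim
        # Mask-weighted sum of the location vectors.
        for weight, vec in zip(mask, feature_map):
            for c in range(dim):
                pooled[c] += weight * vec[c]
        # Guard against an all-zero part vector.
        norm = math.sqrt(sum(v * v for v in pooled)) or 1.0
        parts.append([v / norm for v in pooled])
    return parts


def matching_score(parts_a, parts_b):
    """Sum of dot products between corresponding parts of two images."""
    return sum(
        sum(x * y for x, y in zip(pa, pb))
        for pa, pb in zip(parts_a, parts_b)
    )


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge loss asking score(anchor, positive) to exceed
    score(anchor, negative) by at least `margin` (assumed value)."""
    gap = matching_score(anchor, negative) - matching_score(anchor, positive)
    return max(0.0, margin + gap)


# Toy example: 2 spatial locations, 2 channels, 2 parts.
masks = [[1.0, 0.0], [0.0, 1.0]]
probe = part_features([[1.0, 0.0], [0.0, 1.0]], masks)
same_person = part_features([[2.0, 0.0], [0.0, 3.0]], masks)
other_person = part_features([[0.0, 1.0], [1.0, 0.0]], masks)

print(matching_score(probe, same_person))
print(matching_score(probe, other_person))
print(triplet_loss(probe, same_person, other_person))
```

Because each part is compared only with its counterpart in the other image, the score does not depend on where a part falls inside the bounding box, which is the robustness argument the abstract makes against fixed spatial partitions.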