Object tracking and classification in infrared videos are challenging due to large variations in illumination, target size, and target orientation. Moreover, if only compressive measurements of the infrared videos are available, it is even more difficult to perform target tracking and classification directly in the compressive measurement domain, as many conventional trackers and classifiers can only handle frames reconstructed from compressive measurements. This paper summarizes our research effort on target tracking and classification directly in the compressive measurement domain. We focus on one special type of compressive measurement based on pixel subsampling; that is, the original pixels in the video frames are randomly subsampled. Even in this special compressive sensing setting, conventional trackers do not perform satisfactorily. We propose a deep learning approach that integrates YOLO (You Only Look Once) and ResNet (residual network) for multiple target tracking and classification: YOLO is used for multiple target tracking and ResNet for target classification. Extensive experiments using short wave infrared (SWIR) videos demonstrate the efficacy of the proposed approach even though the training data are very scarce.
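To make the pixel subsampling measurement model concrete, the following minimal sketch randomly discards a chosen fraction of pixels in a single frame. It is an illustration under stated assumptions, not the implementation used in this work: the function name `subsample_pixels`, the `missing_rate` parameter, and the choice to set discarded pixels to zero are all hypothetical, and the frame is represented as a plain list of rows.

```python
import random


def subsample_pixels(frame, missing_rate, seed=None):
    """Randomly keep a subset of pixels in a 2-D frame (list of rows).

    Assumption: discarded pixels are set to 0; the source does not say
    how missing pixels are represented.
    Returns (measured, mask), where mask[r][c] is True for kept pixels.
    """
    if not 0.0 <= missing_rate <= 1.0:
        raise ValueError("missing_rate must be in [0, 1]")
    rows, cols = len(frame), len(frame[0])
    total = rows * cols
    # Number of pixels that survive the subsampling.
    n_keep = total - int(round(missing_rate * total))
    rng = random.Random(seed)
    # Draw the flat indices of the kept pixels uniformly without replacement.
    kept = set(rng.sample(range(total), n_keep))
    mask = [[(r * cols + c) in kept for c in range(cols)] for r in range(rows)]
    measured = [
        [frame[r][c] if mask[r][c] else 0 for c in range(cols)]
        for r in range(rows)
    ]
    return measured, mask


# Hypothetical 4x5 frame with nonzero intensities 1..20, half the pixels dropped.
frame = [[r * 5 + c + 1 for c in range(5)] for r in range(4)]
measured, mask = subsample_pixels(frame, missing_rate=0.5, seed=0)
```

In this sketch a tracker or classifier operating in the measurement domain would receive `measured` directly, without any reconstruction step to fill in the discarded pixels.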