Human re-identification in real-world surveillance camera networks.

Abstract

Video surveillance has become critical for security applications. With cameras and data storage devices getting more affordable, many institutions and organizations have chosen to install camera networks for safety and surveillance. The U.S. Department of Homeland Security, for instance, has directed a huge amount of manpower and expenditure to the installation, maintenance, replacement and operation of surveillance camera networks. Public transportation centers such as airports, train stations, and bus stops are some of the most concentrated environments. The traditional monitoring method, which relies entirely on security officers' observations, becomes less feasible as more and more screens need to be watched at the same time. Instead, video analytics solutions that process multiple cameras simultaneously are more reliable. In this thesis, we focus on one particular application, human re-identification, with an emphasis on the challenges of real-world scenarios.

First, we propose a viewpoint-invariant human re-identification framework that considers pose information and person-specific discriminative features. We observe that appearance consistency at different spatial locations varies with the object's pose. A pixel-wise weighting scheme is learned to provide local robustness to varying viewpoints. For appearance model matching, instead of using a universal distance metric, we further learn a discriminative weighting scheme for a given person. Experimental results show that the proposed method boosts the performance of various state-of-the-art metric learning approaches.

Next, we introduce an efficient metric learning algorithm for the multi-shot re-identification problem based on a combination of random projections and random forests. To obtain a high matching rate, re-identification algorithms usually employ feature vectors with high dimensionality. This severely affects the computational efficiency of metric learning and matching. We use random projections to transfer all the calculations into a very small subspace and boost the performance by accumulating the metrics learned from random forests in each subspace. With the randomness brought by both techniques, we substantially increase the diversity of training data samples and produce a robust ensemble model.

Third, we propose a multi-shot human re-identification framework to address the issue of effectively using image sequences from tracking results. Because of the multi-modal nature of the feature data distribution with respect to different cameras, Local Fisher Discriminant Analysis (LFDA) is particularly suitable for providing a subspace where feature data from different people are maximally separated while the local structure of each class is still preserved. A clustering step is adopted to further eliminate difficulties introduced by the multimodality of the feature data from the same image sequence, so that LFDA is able to better separate the classes. The relationship between different cameras is established by a metric learning step. This algorithm models the appearance characteristics of each person directly from the tracking results. It is very efficient because an analytic solution is achievable for the dimensionality reduction step and metric learning is performed in a lower-dimensional space.

Finally, we introduce an end-to-end human re-identification solution installed in a mid-size U.S. airport. Designing and building fully functional human re-identification software embedded into a real-world surveillance system involves many challenges that are not encountered in typical re-identification research. For instance, one needs to implement essential supporting modules such as video streaming, human detection and tracking, as well as a user-friendly interface. Moreover, it is critical for the system to run in real time so that it can handle live re-identification requests. Real-world issues such as complicated illumination conditions, low-resolution images and crowded scenes should also be taken into consideration. We describe the high-level system design and the algorithm framework of our re-identification solution. We discuss the above challenges and trade-offs, as well as initial results that show the performance of the algorithms in the real-world surveillance task.
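
As a rough illustration of the random-projection idea summarized in the abstract, the sketch below projects high-dimensional feature vectors into several small random subspaces and averages the resulting distances to rank a gallery. It is not the thesis's algorithm: the abstract describes metrics learned by random forests in each subspace, whereas this sketch substitutes a plain Euclidean distance to stay short. All function names, dimensions, and the synthetic data are assumptions made for the example.

```python
import math
import random


def make_projection(d_in, d_out, rng):
    """Return a d_out x d_in Gaussian random matrix scaled by 1/sqrt(d_out)."""
    scale = 1.0 / math.sqrt(d_out)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d_in)]
            for _ in range(d_out)]


def project(matrix, vector):
    """Map a high-dimensional feature vector into the small subspace."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def euclidean(a, b):
    """Euclidean distance (a stand-in for a learned per-subspace metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def ensemble_distance(projections, a, b):
    """Average the subspace distances over all random projections."""
    total = sum(euclidean(project(p, a), project(p, b)) for p in projections)
    return total / len(projections)


def rank_gallery(projections, probe, gallery):
    """Return gallery indices sorted by increasing ensemble distance."""
    scores = [ensemble_distance(projections, probe, g) for g in gallery]
    return sorted(range(len(gallery)), key=scores.__getitem__)


# Hypothetical setup: 200-D features, 20 random subspaces of 10 dimensions.
rng = random.Random(0)
d_in, d_out, n_proj = 200, 10, 20
projections = [make_projection(d_in, d_out, rng) for _ in range(n_proj)]

# Synthetic gallery of 5 "people"; the probe is a noisy copy of person 3.
gallery = [[rng.random() for _ in range(d_in)] for _ in range(5)]
probe = [x + rng.gauss(0.0, 0.01) for x in gallery[3]]

ranking = rank_gallery(projections, probe, gallery)
print("best match:", ranking[0])
```

Each distance is computed on 10 values rather than 200, which is where the efficiency gain would come from. Averaging over several independent projections reduces the distortion that any single random subspace introduces.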

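Bibliographic record

  • Author

    Li, Yang

  • Author affiliation

    Rensselaer Polytechnic Institute

  • Degree-granting institution: Rensselaer Polytechnic Institute
  • Subjects: Electrical engineering; Computer science; Engineering
  • Degree: Ph.D.
  • Year: 2015
  • Pagination: 150 p.
  • Total pages: 150
  • Original format: PDF
  • Language of text: English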

  • Date added to database: 2022-08-17 11:52:20
