Video registration for multimodal surveillance systems.

Abstract

Image registration approaches for automatic multimodal video surveillance systems fall into two general categories based on the range of the captured scene: approaches appropriate for long-range scenes, and approaches suitable for close-range scenes. This subject is not well documented in the literature, especially for close-range surveillance applications. Our research focuses on novel image registration solutions for both close-range and long-range scenes featuring multiple humans. The proposed solutions are presented in the four articles included in this thesis. Our registration methods support further video analysis such as tracking, human localization, behavioral pattern analysis, and object categorization.

For long-range video surveillance, we propose an iterative system that performs thermal-visible video registration, sensor fusion, and people tracking simultaneously. Our video registration is based on RANSAC object trajectory matching, which estimates an affine transformation matrix that globally maps the foreground objects of one image onto the other. The proposed multimodal surveillance system relies on a novel feedback scheme between the registration and tracking modules that improves the performance of both modules iteratively over time. Our methods are designed for online applications, and no camera calibration or special setup is required.

For close-range video surveillance applications, we introduce Local Self-Similarity (LSS) as a viable similarity measure for matching corresponding human body regions in thermal and visible images. We also demonstrate theoretically and quantitatively that LSS, as a thermal-visible similarity measure, is more robust to differences between the textures of corresponding regions than Mutual Information (MI), the classic multimodal similarity measure. LSS also outperforms other viable local image descriptors, including Histogram of Oriented Gradients (HOG), Scale Invariant Feature Transform (SIFT), and Binary Robust Independent Elementary Features (BRIEF).

Moreover, we propose an LSS-based dense local stereo correspondence algorithm built on a voting approach, which estimates a dense disparity map for each foreground region in the image. The resulting disparity map can then be used to align the reference image with the second image. We demonstrate that the proposed LSS-based local registration method outperforms similar state-of-the-art MI-based local registration methods in the literature. Our experiments were carried out using realistic human monitoring scenarios in a close-range scene.

Because local stereo correspondence approaches struggle to estimate accurate disparities in depth discontinuity regions, we propose a novel stereo correspondence method based on global optimization. We introduce a stereo model appropriate for thermal-visible image registration that uses an energy minimization framework, with Belief Propagation (BP) as the method for optimizing the disparity assignment via an energy function. In this method, we integrate color and motion visual cues as soft constraints in the energy function to improve disparity assignment accuracy at depth discontinuities. Although global correspondence approaches are computationally more expensive than Winner Take All (WTA) local correspondence approaches, the efficient BP algorithm and the parallel programming with OpenMP in C++ used in our implementation reduce the processing time significantly and make our methods viable for video surveillance applications. Our methods are implemented in C++ using the OpenCV library and object-oriented programming.

Our methods are designed to be integrated easily with further video analysis. In other words, their input data could come from two synchronized online video streams, and a new analysis module could be added to our frame-by-frame algorithmic diagram. Such further analysis might be object tracking, human localization, and trajectory pattern analysis for multimodal long-range monitoring applications, and behavior pattern analysis, object categorization, and tracking for close-range applications. (Abstract shortened by UMI.)
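
To make the long-range registration step in the abstract more concrete, here is a minimal sketch of RANSAC-based affine estimation. It is not the thesis implementation, which the abstract describes as C++ with OpenCV; Python is used here only for brevity. The sketch assumes that thermal-visible point correspondences, sampled from already matched object trajectories, are given as input. The function names, the tolerance, the iteration count, and the synthetic data are all hypothetical.

```python
import random


def fit_affine(src, dst):
    """Solve the 6 affine parameters from exactly three point pairs (Cramer's rule)."""
    (x1, y1), (x2, y2), (x3, y3) = src
    det = x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)
    if abs(det) < 1e-9:
        return None  # the three source points are (nearly) collinear
    rows = []
    for k in (0, 1):  # k = 0 solves the x' row, k = 1 solves the y' row
        u1, u2, u3 = dst[0][k], dst[1][k], dst[2][k]
        a = (u1 * (y2 - y3) - y1 * (u2 - u3) + (u2 * y3 - u3 * y2)) / det
        b = (x1 * (u2 - u3) - u1 * (x2 - x3) + (x2 * u3 - x3 * u2)) / det
        c = (x1 * (y2 * u3 - y3 * u2) - y1 * (x2 * u3 - x3 * u2)
             + u1 * (x2 * y3 - x3 * y2)) / det
        rows.append((a, b, c))
    return rows


def apply_affine(model, point):
    """Map a point with x' = a*x + b*y + c and y' = d*x + e*y + f."""
    (a, b, c), (d, e, f) = model
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)


def ransac_affine(src_pts, dst_pts, iterations=200, tol=2.0, seed=0):
    """Keep the affine model that agrees with the most correspondences."""
    rng = random.Random(seed)
    indices = range(len(src_pts))
    best_model, best_inliers = None, []
    for _ in range(iterations):
        sample = rng.sample(indices, 3)  # minimal sample for an affine model
        model = fit_affine([src_pts[i] for i in sample],
                           [dst_pts[i] for i in sample])
        if model is None:
            continue
        inliers = []
        for i in indices:
            px, py = apply_affine(model, src_pts[i])
            qx, qy = dst_pts[i]
            if (px - qx) ** 2 + (py - qy) ** 2 <= tol ** 2:
                inliers.append(i)
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers


# Synthetic example (assumed data): 20 "thermal" points mapped into the
# "visible" image by a known affine transform, with three bad correspondences.
true_model = [(1.1, 0.05, 12.0), (-0.04, 0.95, -7.0)]
thermal = [(10.0 * i, 5.0 * ((i * 7) % 11)) for i in range(20)]
visible = [apply_affine(true_model, p) for p in thermal]
for i in (3, 8, 15):
    visible[i] = (visible[i][0] + 40.0, visible[i][1] - 35.0)

model, inliers = ransac_affine(thermal, visible)
print("estimated affine rows:", model)
print("number of inliers:", len(inliers))
```

A production system would typically refine the winning model with a least-squares fit over all inliers and, as the abstract describes, feed the result back to the tracking module. Both steps are omitted from this sketch.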


Bibliographic record

Author: Torabi, Atousa
Author affiliation: Ecole Polytechnique, Montreal (Canada)
Degree-granting institution: Ecole Polytechnique, Montreal (Canada)
Subject: Engineering Computer
Degree: Ph.D.
Year: 2012
Pages: 151
Original format: PDF
Language: English

