Automatic understanding of video content is a problem of growing importance. Video understanding algorithms must be accurate, robust, fast, and scalable. Accuracy builds user confidence; robustness enables greater autonomy and reduces the need for human intervention; applications such as navigation and mapping demand real-time performance; and scalability maintains high speed as capacity expands to multiple users and sensors. In this thesis, I propose a "bag-of-phrases" model that improves the accuracy and robustness of the popular "bag-of-words" models. The model applies a "geometric grammar" to impose structural constraints on the unordered "bag of words." I incorporate this model into an architecture that combines an object recognizer, a tracker, and a geolocation module, exploiting the complementarity of these components to compensate for their individual weaknesses and thereby improve accuracy, robustness, and speed. Subsequently, I introduce VICTORIOUS, a fast implementation of the proposed architecture. Evaluation on computer-generated data as well as on Caltech-101 indicates that this implementation is accurate, robust, and capable of real-time performance on current-generation hardware. This implementation, together with the "bag-of-phrases" model and the integrated architecture, forms a step toward meeting the requirements for an accurate, robust, real-time vision system.
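To illustrate the distinction between an unordered "bag of words" and a "bag of phrases" with geometric structure, the following minimal sketch contrasts the two representations. The function names, the distance threshold, and the pairwise orientation-bin rule are illustrative assumptions, not the thesis's actual formulation of the geometric grammar.

```python
# Minimal sketch: bag-of-words histogram vs. a "bag-of-phrases" that adds
# pairwise geometric constraints. All names and thresholds are hypothetical.
from collections import Counter
from itertools import combinations
import math

def bag_of_words(features):
    """features: list of (word_id, x, y). Returns an unordered word histogram."""
    return Counter(word_id for word_id, _, _ in features)

def bag_of_phrases(features, max_dist=50.0):
    """Counts word pairs that co-occur within max_dist pixels, keyed by an
    order-independent pair plus a coarse relative-orientation bin
    (a stand-in for a geometric-grammar constraint)."""
    phrases = Counter()
    for (w1, x1, y1), (w2, x2, y2) in combinations(features, 2):
        dx, dy = x2 - x1, y2 - y1
        if math.hypot(dx, dy) <= max_dist:
            angle_bin = int((math.degrees(math.atan2(dy, dx)) % 180) // 45)
            phrases[(min(w1, w2), max(w1, w2), angle_bin)] += 1
    return phrases

# Example: three visual words detected in one frame
feats = [(3, 10.0, 12.0), (7, 30.0, 18.0), (3, 200.0, 220.0)]
print(bag_of_words(feats))    # unordered counts: Counter({3: 2, 7: 1})
print(bag_of_phrases(feats))  # only the nearby (3, 7) pair forms a phrase
```

The phrase histogram discards no bag-of-words information; it adds spatially constrained co-occurrences, which is the structural signal the abstract credits for the gains in accuracy and robustness.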