
Video indexing with combined tracking and object recognition for improved object understanding in scenes



Abstract

Automatic understanding of video content is a problem that grows in importance every day. Video understanding algorithms require accuracy, robustness, speed, and scalability. Accuracy builds user confidence. Robustness enables greater autonomy and reduces human intervention. Applications such as navigation and mapping demand real-time performance. Scalability is also important for maintaining high speed while expanding capacity to multiple users and sensors. In this thesis, I propose a "bag-of-phrases" model to improve the accuracy and robustness of the popular "bag-of-words" models. This model applies a "geometric grammar" to add structural constraints to the unordered "bag-of-words." I incorporate this model into an architecture that combines an object recognizer, a tracker, and a geolocation module. This architecture uses the complementarity of its components to compensate for their individual weaknesses, which allows for improvements in accuracy, robustness, and speed. Subsequently, I introduce VICTORIOUS, a fast implementation of the proposed architecture. Evaluation on computer-generated data as well as Caltech-101 indicates that this implementation is accurate, robust, and capable of performing in real time on current-generation hardware. This implementation, together with the "bag-of-phrases" model and integrated architecture, forms a step towards meeting the requirements for an accurate, robust, real-time vision system.
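
The contrast between an unordered "bag-of-words" and a representation with structural constraints can be illustrated with a minimal sketch. This is not the thesis's method: the abstract does not define the "geometric grammar," so the example substitutes a simple assumption, namely that a "phrase" is a pair of visual words whose keypoints lie within a fixed pixel radius. The function names, the `radius` parameter, and the toy feature tuples are all hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import hypot

# Each feature is (visual_word_id, x, y). The word IDs are assumed to come
# from some upstream quantizer (for example, k-means over local descriptors).

def bag_of_words(features):
    """Unordered histogram of visual words; all positions are ignored."""
    return Counter(word for word, _, _ in features)

def bag_of_phrases(features, radius):
    """Histogram of word pairs whose keypoints lie within `radius` pixels.

    Hypothetical stand-in for a geometric constraint: a pair counts as a
    "phrase" only if its two words are spatial neighbours.
    """
    phrases = Counter()
    for (w1, x1, y1), (w2, x2, y2) in combinations(features, 2):
        if hypot(x1 - x2, y1 - y2) <= radius:
            phrases[tuple(sorted((w1, w2)))] += 1
    return phrases

# Two toy images containing the same words in different spatial layouts.
image_a = [(3, 10.0, 10.0), (7, 12.0, 11.0), (5, 90.0, 80.0)]
image_b = [(3, 10.0, 10.0), (7, 95.0, 85.0), (5, 90.0, 80.0)]

same_words = bag_of_words(image_a) == bag_of_words(image_b)
same_phrases = bag_of_phrases(image_a, 10.0) == bag_of_phrases(image_b, 10.0)
```

In this toy case the two word histograms are identical while the pair histograms differ, which is the kind of ambiguity a structural constraint is meant to resolve.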

Bibliographic details

  • Author

    Xu Yuetian;

  • Affiliation
  • Year 2009
  • Total pages
  • Original format PDF
  • Text language eng
  • Chinese Library Classification

