Recognizing visual object categories with subspace methods and a learned hierarchical shape vocabulary

Abstract

The topic of the thesis is visual object class recognition and detection in images. In the first part of the thesis, we developed an approach that combines reconstructive and discriminative subspace methods for robust object classification. In the second part, we developed a framework for learning a hierarchical compositional shape vocabulary for representing multiple object classes and detecting them in images.

Linear subspace methods that provide sufficient reconstruction of the data, such as PCA (Principal Component Analysis), offer an efficient way of dealing with missing pixels, outliers, and occlusions that often appear in images. Discriminative methods such as LDA (Linear Discriminant Analysis) and CCA (Canonical Correlation Analysis), on the other hand, are better suited for classification and regression tasks but are highly sensitive to corrupted data. If an image in the test phase contains outliers (e.g. an object in the image is partly occluded), discriminative methods are likely to assign it to the wrong class. In this thesis, we propose an approach that combines discriminative and reconstructive methods in a way that enables near-perfect classification performance even when objects are partly occluded at test time. The idea behind the proposed approach is to augment the subspace basis given by a discriminative approach with a small set of additional basis vectors computed by a reconstructive method. In the space spanned by the augmented basis, we are able to detect and remove outlying pixels using a robust subsampling scheme and classify images based on the inliers. The proposed approach is thus capable of robust classification/regression with a high breakdown point. The theoretical results are demonstrated on several computer vision tasks, showing that the proposed approach significantly outperforms the standard discriminative methods in the case of missing pixels and images containing occlusions and outliers.

In the second and main part of the thesis, we present a novel hierarchical framework for representing, learning and detecting object classes in images. Hierarchies are important because they allow feature sharing between objects at multiple levels of representation, lead to better generalization within and across object classes, are able to code exponential variability in a very compact way, and enable fast inference. This makes them potentially suitable for learning and recognizing a higher number of object classes. However, the success of hierarchical approaches so far has been hindered by the use of hand-crafted features or predetermined grouping rules.

In this thesis, we present a novel framework for learning a hierarchical compositional shape vocabulary for representing multiple object classes. The approach takes simple contour fragments and learns their frequent spatial configurations. These are recursively combined into increasingly more complex and class-specific shape compositions, each allowing a high degree of shape variability. At the top level of the vocabulary, the compositions are sufficiently large and complex to represent the whole shapes of the objects. We learn the vocabulary layer after layer, by gradually increasing the size of the window of analysis and reducing the spatial resolution at which the shape configurations are learned. Compositions are formed by first learning spatial relations between pairs of parts (features from the previous layer) and then learning their frequent higher-order co-occurrences. The lower layers are learned jointly on images of all classes, whereas the higher layers of the vocabulary are learned incrementally, by presenting the algorithm with one object class after another. The experimental results show that the learned multi-class object representation scales favorably with the number of object classes and achieves state-of-the-art detection performance with both faster inference and shorter training times. Additionally, the learned multi-class object representation is very compact, needing only a few megabytes when stored on disk. We also demonstrate the usefulness of the features learned in the intermediate layers of the hierarchy for object classification.
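
The first part of the abstract (augmenting a discriminative basis with a few reconstructive vectors, then estimating coefficients from inlying pixels only) can be illustrated with a minimal NumPy sketch. It is not the thesis implementation, and everything specific in it is an assumption made for illustration: the function names `augment_basis` and `robust_coefficients`, the use of a class-mean difference as a stand-in for an LDA direction, the least-median-of-residuals selection over random pixel subsets, the inlier threshold, and the synthetic data.

```python
import numpy as np


def augment_basis(W_disc, X, k):
    """Orthonormal basis spanning W_disc plus k PCA directions of what W_disc misses."""
    mean = X.mean(axis=0)
    Q, _ = np.linalg.qr(W_disc)                 # orthonormalize the discriminative directions
    Xc = X - mean
    residual = Xc - (Xc @ Q) @ Q.T              # part of the data that Q cannot reconstruct
    _, _, Vt = np.linalg.svd(residual, full_matrices=False)
    return mean, np.hstack([Q, Vt[:k].T])       # shape (n_pixels, m + k)


def robust_coefficients(B, x, mean, n_trials=500, rng=None):
    """Estimate the coefficients of x in basis B from pixels that agree with the model."""
    rng = np.random.default_rng() if rng is None else rng
    d, m = B.shape
    xc = x - mean
    best_score, best_res = np.inf, None
    for _ in range(n_trials):
        idx = rng.choice(d, size=2 * m, replace=False)        # small random pixel subset
        a, *_ = np.linalg.lstsq(B[idx], xc[idx], rcond=None)  # hypothesis from the subset
        res = np.abs(xc - B @ a)                              # residual on all pixels
        score = np.median(res)                                # tolerates < 50% outliers
        if score < best_score:
            best_score, best_res = score, res
    # Assumed threshold: pixels within 2.5 robust standard deviations count as inliers.
    inliers = best_res <= 2.5 * 1.4826 * best_score + 1e-12
    a, *_ = np.linalg.lstsq(B[inliers], xc[inliers], rcond=None)  # refit on inliers only
    return a, inliers


# Synthetic demo: 200-pixel "images" from a 5-dimensional linear model, two classes.
rng = np.random.default_rng(0)
d, latent, n = 200, 5, 300
A = rng.normal(size=(d, latent))
labels = rng.integers(0, 2, size=n)
Z = rng.normal(size=(n, latent))
Z[:, 0] += 4.0 * labels                       # classes differ along the first latent factor
X = Z @ A.T + 0.01 * rng.normal(size=(n, d))

# Stand-in for an LDA direction (assumption): the difference of the class means.
W_disc = (X[labels == 1].mean(axis=0) - X[labels == 0].mean(axis=0))[:, None]
mean, B = augment_basis(W_disc, X, k=4)

x_clean = X[0]
x_occluded = x_clean.copy()
x_occluded[:40] = 10.0                        # occlude 20% of the pixels

a_clean = B.T @ (x_clean - mean)
a_plain = B.T @ (x_occluded - mean)           # standard projection, corrupted by the occluder
a_robust, inliers = robust_coefficients(B, x_occluded, mean, rng=np.random.default_rng(1))

err_plain = np.linalg.norm(a_plain - a_clean)
err_robust = np.linalg.norm(a_robust - a_clean)
print(f"plain error: {err_plain:.3f}, robust error: {err_robust:.3f}")
```

In this sketch the first coefficient of `a_robust` plays the role of the discriminative feature and would be passed to whatever classifier was trained on clean projections. The extra PCA directions are there only so that a clean image is reconstructed well enough for occluded pixels to stand out as large residuals.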

Bibliographic record

  • Author: Fidler, Sanja
  • Year: 2010
  • Original format: PDF
  • Language: English
