Object based Scene Representations using Fisher Scores of Local Subspace Projections


Abstract

Several works have shown that deep CNNs can be easily transferred across datasets, e.g. from object recognition on ImageNet to object detection on Pascal VOC. Less clear, however, is the ability of CNNs to transfer knowledge across tasks. A common example of such transfer is the problem of scene classification, which should leverage localized object detections to recognize holistic visual concepts. While this problem is currently addressed with Fisher vector representations, these are shown here to be ineffective for the high-dimensional and highly non-linear features extracted by modern CNNs. It is argued that this is mostly due to the reliance on a model, the Gaussian mixture of diagonal covariances, which has a very limited ability to capture the second-order statistics of CNN features. This problem is addressed by the adoption of a better model, the mixture of factor analyzers (MFA), which approximates the non-linear data manifold by a collection of local subspaces. The Fisher score with respect to the MFA (MFA-FS) is derived and proposed as an image representation for holistic image classifiers. Extensive experiments show that the MFA-FS has state-of-the-art performance for object-to-scene transfer and that this transfer actually outperforms the training of a scene CNN from a large scene dataset. The two representations are also shown to be complementary, in the sense that their combination outperforms each of the representations by itself. When combined, they produce a state-of-the-art scene classifier.
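The Fisher score of an MFA mentioned in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: it assumes each component k has covariance S_k = Lambda_k Lambda_k^T + diag(psi_k) and uses the standard log-likelihood gradients with respect to the component means and factor loadings. The function names, argument layout, and the choice to return unnormalized sums are all assumptions made for illustration.

```python
import numpy as np


def mfa_log_joint(x, pis, mus, lambdas, psis):
    """Return log(pi_k * N(x; mu_k, S_k)) for every component k.

    Assumed model: S_k = Lambda_k @ Lambda_k.T + diag(psi_k).
    """
    D = x.shape[0]
    out = np.empty(len(pis))
    for k in range(len(pis)):
        S = lambdas[k] @ lambdas[k].T + np.diag(psis[k])
        d = x - mus[k]
        _, logdet = np.linalg.slogdet(S)
        maha = d @ np.linalg.solve(S, d)
        out[k] = np.log(pis[k]) - 0.5 * (D * np.log(2.0 * np.pi) + logdet + maha)
    return out


def mfa_log_likelihood(X, pis, mus, lambdas, psis):
    """Total log-likelihood of the rows of X under the mixture."""
    total = 0.0
    for x in X:
        lj = mfa_log_joint(x, pis, mus, lambdas, psis)
        m = lj.max()
        total += m + np.log(np.exp(lj - m).sum())  # log-sum-exp
    return total


def mfa_fisher_score(X, pis, mus, lambdas, psis):
    """Gradient of the log-likelihood w.r.t. the means and factor loadings.

    Returns (g_mu, g_lambda) with shapes (K, D) and (K, D, R).
    """
    mus = np.asarray(mus, dtype=float)
    lambdas = np.asarray(lambdas, dtype=float)
    g_mu = np.zeros_like(mus)
    g_lambda = np.zeros_like(lambdas)
    for x in X:
        lj = mfa_log_joint(x, pis, mus, lambdas, psis)
        post = np.exp(lj - lj.max())
        post /= post.sum()  # posterior p(k | x)
        for k in range(len(pis)):
            S = lambdas[k] @ lambdas[k].T + np.diag(psis[k])
            s_inv_d = np.linalg.solve(S, x - mus[k])  # S_k^{-1} (x - mu_k)
            g_mu[k] += post[k] * s_inv_d
            # (S^{-1} d d^T S^{-1} - S^{-1}) Lambda_k
            g_lambda[k] += post[k] * (
                np.outer(s_inv_d, s_inv_d) @ lambdas[k]
                - np.linalg.solve(S, lambdas[k])
            )
    return g_mu, g_lambda


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    K, D, R, N = 3, 8, 2, 20  # toy sizes, not the paper's settings
    pis = np.full(K, 1.0 / K)
    mus = rng.normal(size=(K, D))
    lambdas = rng.normal(size=(K, D, R))
    psis = rng.uniform(0.5, 1.5, size=(K, D))
    X = rng.normal(size=(N, D))  # stand-in for local CNN features of one image
    g_mu, g_lambda = mfa_fisher_score(X, pis, mus, lambdas, psis)
    print(g_mu.shape, g_lambda.shape)
```

In this sketch, a fixed-length image descriptor could be formed by flattening and concatenating the per-component gradients. Any normalization (for example by the number of features or by Fisher information) is left out and would need to follow the paper.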

Bibliographic record

Source: Annual Conference on Neural Information Processing Systems | 2016 | p. 2163-2899 | 9 pages
Authors: Mandar Dixit; Nuno Vasconcelos
Original format: PDF
Chinese Library Classification: Information processing

