Annual Conference on Neural Information Processing Systems

Object based Scene Representations using Fisher Scores of Local Subspace Projections

Abstract

Several works have shown that deep CNNs can be easily transferred across datasets, e.g. the transfer from object recognition on ImageNet to object detection on Pascal VOC. Less clear, however, is the ability of CNNs to transfer knowledge across tasks. A common example of such transfer is the problem of scene classification, which should leverage localized object detections to recognize holistic visual concepts. While this problem is currently addressed with Fisher vector representations, these are now shown to be ineffective for the high-dimensional and highly non-linear features extracted by modern CNNs. It is argued that this is mostly due to the reliance on a model, the Gaussian mixture of diagonal covariances, which has a very limited ability to capture the second-order statistics of CNN features. This problem is addressed by the adoption of a better model, the mixture of factor analyzers (MFA), which approximates the non-linear data manifold by a collection of local subspaces. The Fisher score with respect to the MFA (MFA-FS) is derived and proposed as an image representation for holistic image classifiers. Extensive experiments show that the MFA-FS has state-of-the-art performance for object-to-scene transfer and that this transfer actually outperforms the training of a scene CNN from a large scene dataset. The two representations are also shown to be complementary, in the sense that their combination outperforms each of the representations by itself. When combined, they produce a state-of-the-art scene classifier.
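
To make the contrast between the two models concrete, the following is a minimal sketch, not the paper's derivation. It assumes the standard factor-analyzer covariance for component k, Lambda_k Lambda_k^T + Psi_k with diagonal Psi_k, and computes only the textbook gradient of the mixture log-likelihood with respect to each component mean, gamma_k(x) Sigma_k^{-1} (x - mu_k). The full MFA-FS is not reproduced on this page, so the function names, the choice of mean-only scores, and the average pooling over local descriptors are all illustrative assumptions.

```python
import numpy as np


def component_covariance(loading, noise_var):
    """Factor-analyzer covariance: low-rank subspace part plus diagonal noise."""
    return loading @ loading.T + np.diag(noise_var)


def log_joint(x, weights, means, loadings, noise_vars):
    """log(w_k) + log N(x; mu_k, Sigma_k) for every component k."""
    num_components, dim = means.shape
    out = np.empty(num_components)
    for k in range(num_components):
        cov = component_covariance(loadings[k], noise_vars[k])
        diff = x - means[k]
        _, logdet = np.linalg.slogdet(cov)
        maha = diff @ np.linalg.solve(cov, diff)
        out[k] = np.log(weights[k]) - 0.5 * (dim * np.log(2 * np.pi) + logdet + maha)
    return out


def log_likelihood(x, weights, means, loadings, noise_vars):
    """Mixture log-likelihood, computed with the log-sum-exp trick."""
    lj = log_joint(x, weights, means, loadings, noise_vars)
    peak = lj.max()
    return peak + np.log(np.exp(lj - peak).sum())


def mean_fisher_score(x, weights, means, loadings, noise_vars):
    """Gradient of log p(x) w.r.t. each component mean, shape (K, D)."""
    lj = log_joint(x, weights, means, loadings, noise_vars)
    resp = np.exp(lj - lj.max())
    resp /= resp.sum()  # posterior responsibilities gamma_k(x)
    score = np.empty_like(means)
    for k in range(means.shape[0]):
        cov = component_covariance(loadings[k], noise_vars[k])
        score[k] = resp[k] * np.linalg.solve(cov, x - means[k])
    return score


def pooled_score(descriptors, weights, means, loadings, noise_vars):
    """Assumed pooling: average per-descriptor scores, then flatten to one vector."""
    scores = [mean_fisher_score(x, weights, means, loadings, noise_vars)
              for x in descriptors]
    return np.mean(scores, axis=0).ravel()
```

Setting every loading matrix to zero reduces each component to a diagonal-covariance Gaussian, which is the model the abstract argues is too limited to capture second-order statistics of CNN features.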
