
Beyond SIFT for image classification


Abstract

For classifying images, scenes or objects, the most popular approach is the feature extraction-coding-pooling framework, which generates discriminative and robust image representations from densely extracted local patches, mainly SIFT or HOG descriptors. Most recent research focuses on improving the coding and pooling stages. In this work, we show that substantial improvements can also be obtained by coding information closer to the pixel-value level, in the same way that deep-learning architectures do. We introduce a two-layer, stacked coder-pooler architecture in which the first layer is dedicated to extracting, from our so-called Differential Vectors (DV) patches, local low-level features that are more discriminative and efficient than their classic handcrafted counterparts. This first layer can advantageously replace any classic dense SIFT/HOG patch extraction stage. We demonstrate the effectiveness of our approach on three datasets: UIUC-Sports, Scene 15 and Caltech 101. We achieve excellent performance with simple linear classification while using basic coding and pooling schemes for both layers, namely Sparse Coding (SC) and Max-Pooling (MP), respectively.
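As a rough illustration of what one coder-pooler layer does, the sketch below sparse-codes a set of patch vectors against a fixed dictionary and then max-pools the resulting codes into one feature vector. It is not the authors' implementation: the abstract specifies neither a solver nor a pooling variant, so the ISTA solver, the pooling over absolute values, the helper names (`soft_threshold`, `sparse_code`, `max_pool`) and the toy dictionary are all assumptions made for this example.

```python
def soft_threshold(x, t):
    """Shrink x toward zero by t (proximal operator of the L1 penalty)."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def sparse_code(patch, dictionary, lam=0.1, n_iter=100):
    """Approximately minimise 0.5*||patch - sum_k a_k*atom_k||^2 + lam*||a||_1.

    Uses ISTA (an assumed solver, not taken from the source).
    `dictionary` is a list of K atoms, each a list with len(patch) entries.
    """
    # The sum of squared atom norms upper-bounds the Lipschitz constant of
    # the gradient, so its inverse is a safe (conservative) step size.
    step = 1.0 / sum(v * v for atom in dictionary for v in atom)
    code = [0.0] * len(dictionary)
    for _ in range(n_iter):
        # Reconstruct the patch from the current code.
        recon = [
            sum(c * atom[j] for c, atom in zip(code, dictionary))
            for j in range(len(patch))
        ]
        residual = [r - p for r, p in zip(recon, patch)]
        # Gradient of the quadratic term with respect to each coefficient.
        grad = [sum(a * r for a, r in zip(atom, residual)) for atom in dictionary]
        # Gradient step followed by soft-thresholding.
        code = [soft_threshold(c - step * g, step * lam) for c, g in zip(code, grad)]
    return code


def max_pool(codes):
    """Pool a list of codes into one vector: per-atom maximum absolute value.

    Pooling over absolute values is an assumption; the source only says
    "Max-Pooling".
    """
    return [max(abs(code[k]) for code in codes) for k in range(len(codes[0]))]


# Toy example: two orthonormal atoms and two 2-D "patches".
toy_dictionary = [[1.0, 0.0], [0.0, 1.0]]
toy_patches = [[1.0, 0.0], [0.0, 0.6]]
toy_codes = [sparse_code(p, toy_dictionary) for p in toy_patches]
pooled = max_pool(toy_codes)
```

In a two-layer stack of the kind the abstract describes, pooled vectors from the first layer would play the role of the local descriptors fed to a second coder-pooler, and the final pooled vector would go to a linear classifier. How the dictionaries are learned and how pooling regions are laid out is not stated in the abstract and is left out here.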
