ACM Transactions on Applied Perception (TAP)

Assessing Neural Network Scene Classification from Degraded Images


Abstract

Scene recognition is an essential component of both machine and biological vision. Recent advances in computer vision using deep convolutional neural networks (CNNs) have demonstrated impressive sophistication in scene recognition, through training on large datasets of labeled scene images (Zhou et al. 2018, 2014). One criticism of CNN-based approaches is that performance may not generalize well beyond the training image set (Torralba and Efros 2011), and may be hampered by minor image modifications, which in some cases are barely perceptible to the human eye (Goodfellow et al. 2015; Szegedy et al. 2013). While these "adversarial examples" may be unlikely in natural contexts, during many real-world visual tasks scene information can be degraded or limited due to defocus blur, camera motion, sensor noise, or occluding objects. Here, we quantify the impact of several image degradations (some common, and some more exotic) on indoor/outdoor scene classification using CNNs. For comparison, we use human observers as a benchmark, and also evaluate performance against classifiers using limited, manually selected descriptors. While the CNNs outperformed the other classifiers and rivaled human accuracy for intact images, our results show that their classification accuracy is more affected by image degradations than human observers. On a practical level, however, accuracy of the CNNs remained well above chance for a wide range of image manipulations that disrupted both local and global image statistics. We also examine the level of image-by-image agreement with human observers, and find that the CNNs' agreement with observers varied as a function of the nature of image manipulation. In many cases, this agreement was not substantially different from the level one would expect to observe for two independent classifiers. Together, these results suggest that CNN-based scene classification techniques are relatively robust to several image degradations. However, the pattern of classifications obtained for ambiguous images does not appear to closely reflect the strategies employed by human observers.
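As a rough illustration (not taken from the paper), the sketch below shows how one might construct two of the degradations described above, score a binary indoor/outdoor classifier on degraded images, and compute the chance level of image-by-image agreement expected from two independent classifiers. The `classify` callable is a hypothetical stand-in for whichever CNN or descriptor-based classifier is under test, and the blur and noise parameters are arbitrary assumptions.

```python
# Minimal sketch, assuming PIL and NumPy; the classifier itself is a placeholder.
import numpy as np
from PIL import Image, ImageFilter

def defocus_blur(img: Image.Image, radius: float) -> Image.Image:
    """Approximate defocus blur with a Gaussian kernel of the given radius (pixels)."""
    return img.filter(ImageFilter.GaussianBlur(radius))

def sensor_noise(img: Image.Image, sigma: float) -> Image.Image:
    """Add zero-mean Gaussian noise (std. dev. in 0-255 intensity units) to simulate sensor noise."""
    arr = np.asarray(img, dtype=np.float32)
    noisy = arr + np.random.normal(0.0, sigma, arr.shape)
    return Image.fromarray(np.clip(noisy, 0, 255).astype(np.uint8))

def accuracy(classify, images, labels):
    """Fraction of images whose predicted indoor/outdoor label matches ground truth.

    `classify` is any callable mapping a PIL image to 'indoor' or 'outdoor'.
    """
    preds = [classify(im) for im in images]
    return float(np.mean([p == y for p, y in zip(preds, labels)]))

def expected_independent_agreement(p_a: float, p_b: float) -> float:
    """Baseline image-by-image agreement for two *independent* binary classifiers
    with accuracies p_a and p_b: they agree when both are right or both are wrong."""
    return p_a * p_b + (1.0 - p_a) * (1.0 - p_b)
```

Observed agreement substantially above `expected_independent_agreement` would indicate that two classifiers tend to err on the same images (i.e., share a strategy); agreement near that baseline, as reported here for several manipulations, suggests their errors are largely unrelated.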