Journal: IEEE Transactions on Image Processing

Feature Pyramid Reconfiguration With Consistent Loss for Object Detection

Abstract

Exploiting feature pyramids has become a crucial way to boost object detection performance. While various pyramid representations have been developed, previous works are still inefficient at integrating semantic information across different scales. Moreover, recent object detectors struggle in applications that require accurate object localization, mainly due to the coarse definition of positive examples in the training and prediction phases. In this paper, we begin by analyzing current pyramid solutions, and then propose a novel architecture that reconfigures the feature hierarchy in a flexible yet effective way. In particular, our architecture consists of two lightweight and trainable processes: global attention and local reconfiguration. The global attention emphasizes the global information of each feature scale, while the local reconfiguration captures the local correlations across different scales. Both the global attention and the local reconfiguration are non-linear and thus have greater expressive power. Next, we find that the loss function used to train object detectors is the central cause of the inaccurate localization problem. We propose to address this issue by reshaping the standard cross-entropy loss so that it focuses more on accurate predictions. Both the feature reconfiguration and the consistent loss can be used in popular one-stage (SSD, RetinaNet) and two-stage (Faster R-CNN) detection frameworks. Extensive experimental evaluations on the PASCAL VOC 2007, PASCAL VOC 2012, and MS COCO datasets demonstrate that our models achieve consistent and significant boosts compared with other state-of-the-art methods.
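
The abstract does not spell out how the global attention is computed. As a rough illustration only, the sketch below assumes a squeeze-and-excitation-style channel gate: each channel of one pyramid level is summarized by its global average, a small non-linear bottleneck turns those summaries into per-channel gates, and the feature map is rescaled by the gates. The function names (`global_attention`, `global_average_pool`, `matvec`), the weight names (`w_reduce`, `w_expand`), and the toy values are all hypothetical and are not taken from the paper.

```python
import math


def global_average_pool(feature):
    """Reduce a C x H x W feature map (nested lists) to one mean per channel."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature
    ]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def global_attention(feature, w_reduce, w_expand):
    """Assumed SE-style gate: pool globally, apply ReLU then sigmoid, rescale."""
    # Squeeze: one global statistic per channel.
    pooled = global_average_pool(feature)
    # Excite: bottleneck with ReLU, then expand back to C gates in (0, 1).
    hidden = [max(0.0, h) for h in matvec(w_reduce, pooled)]
    gates = [1.0 / (1.0 + math.exp(-g)) for g in matvec(w_expand, hidden)]
    # Rescale every spatial position of a channel by that channel's gate.
    scaled = [
        [[gate * x for x in row] for row in channel]
        for channel, gate in zip(feature, gates)
    ]
    return scaled, gates


# Toy 2-channel, 2x2 feature map standing in for one pyramid level.
feature = [
    [[1.0, 1.0], [1.0, 1.0]],  # channel 0, mean 1.0
    [[0.0, 2.0], [4.0, 2.0]],  # channel 1, mean 2.0
]
w_reduce = [[-1.0, 1.0]]       # hypothetical weights, 2 channels -> 1 hidden unit
w_expand = [[1.0], [-1.0]]     # hypothetical weights, 1 hidden unit -> 2 gates

scaled, gates = global_attention(feature, w_reduce, w_expand)
```

With these toy weights the first channel receives a gate above 0.5 and the second a gate below 0.5, so the first channel is emphasized relative to the second. In a trained detector the weights would be learned, and the gate would be applied at every pyramid level.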
