Venue: European Conference on Computer Vision

SSD: Single Shot MultiBox Detector

Abstract

We present a method for detecting objects in images using a single deep neural network. Our approach, named SSD, discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location. At prediction time, the network generates scores for the presence of each object category in each default box and produces adjustments to the box to better match the object shape. Additionally, the network combines predictions from multiple feature maps with different resolutions to naturally handle objects of various sizes. SSD is simple relative to methods that require object proposals because it completely eliminates proposal generation and the subsequent pixel or feature resampling stages, and encapsulates all computation in a single network. This makes SSD easy to train and straightforward to integrate into systems that require a detection component. Experimental results on the PASCAL VOC, COCO, and ILSVRC datasets confirm that SSD has accuracy competitive with methods that use an additional object proposal step and is much faster, while providing a unified framework for both training and inference. For 300 × 300 input, SSD achieves 74.3% mAP on VOC2007 test at 59 FPS on an Nvidia Titan X, and for 512 × 512 input, SSD achieves 76.9% mAP, outperforming a comparable state-of-the-art Faster R-CNN model. Compared to other single-stage methods, SSD has much better accuracy even with a smaller input image size.
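The idea of default boxes tiled over several feature maps can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation. The layer sizes, scales, and aspect ratios in `LAYERS` and `ASPECT_RATIOS` are hypothetical values chosen for readability. The helper names `default_boxes` and `apply_offsets` are likewise made up here. The exponential width/height update in `apply_offsets` is one common box parameterization and is an assumption, since the abstract only says the network "produces adjustments to the box".

```python
import math
from itertools import product


def default_boxes(fmap_size, scale, aspect_ratios):
    """Return (cx, cy, w, h) default boxes in [0, 1] image coordinates
    for a square feature map with fmap_size x fmap_size locations."""
    boxes = []
    for row, col in product(range(fmap_size), repeat=2):
        # Center of this feature-map cell, normalized to the image.
        cx = (col + 0.5) / fmap_size
        cy = (row + 0.5) / fmap_size
        for ar in aspect_ratios:
            # Every aspect ratio keeps the same area: w * h == scale ** 2.
            w = scale * math.sqrt(ar)
            h = scale / math.sqrt(ar)
            boxes.append((cx, cy, w, h))
    return boxes


def apply_offsets(box, offsets):
    """Adjust one default box by predicted (dx, dy, dw, dh).
    Assumed parameterization: center shifts are relative to box size,
    and width/height are scaled exponentially."""
    cx, cy, w, h = box
    dx, dy, dw, dh = offsets
    return (cx + dx * w, cy + dy * h, w * math.exp(dw), h * math.exp(dh))


# Hypothetical configuration: (feature-map size, box scale) per layer.
# Coarser feature maps get larger boxes, so they cover larger objects.
LAYERS = [(8, 0.2), (4, 0.5), (2, 0.8)]
ASPECT_RATIOS = [1.0, 2.0, 0.5]

all_boxes = []
for size, scale in LAYERS:
    all_boxes.extend(default_boxes(size, scale, ASPECT_RATIOS))

print(len(all_boxes))  # (64 + 16 + 4) locations * 3 ratios = 252
```

In a detector of this kind, the network would output one score per object category and one set of four offsets for each of these boxes. `apply_offsets` shows how such offsets could turn a default box into a final prediction.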