Object detection is a fundamental yet challenging problem in both natural and aerial scenes. Although region-based deep convolutional neural networks (CNNs) have brought impressive improvements to object detection in natural scenes, detecting oriented objects in aerial images remains challenging because of the complexity of aerial image backgrounds and the large degree of freedom in object scale, orientation, and density. To tackle these problems, we propose a novel network composed of a backbone structure with a global attention module, a multi-scale object proposal network, and a final oriented object detector, which can efficiently detect small, arbitrarily oriented, and densely packed objects in aerial images. We use pyramid pooling blocks as a global attention module on top of the backbone structure to generate discriminative feature representations, which provide diverse context information and complementary receptive fields for the detector. The global attention module helps the model reduce false alarms and incorrect classifications against complex aerial image backgrounds. The multi-scale object proposal network generates object-like regions at different scales from several intermediate layers. These regions are then sent to the detector for refined classification and regression, which alleviates the problem of scale variation in aerial images. The oriented object detector is designed to predict inclined boxes. Quantitative comparisons on the challenging DOTA dataset show that the proposed method significantly improves performance over the baseline algorithms and is effective for object detection in aerial images.
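
To make the pyramid pooling idea concrete, the following is a minimal NumPy sketch of one pyramid pooling step on a single feature map, not the network described here. The function names, the bin sizes (1, 2, 3, 6), and the nearest-neighbour upsampling are assumptions chosen for illustration; the abstract does not specify them, and a trained module would typically also apply learned convolutions to each branch.

```python
import numpy as np


def adaptive_avg_pool(feat, bins):
    """Average-pool a (C, H, W) feature map into a (C, bins, bins) grid."""
    c, h, w = feat.shape
    out = np.zeros((c, bins, bins), dtype=np.float64)
    for i in range(bins):
        # Floor the start and ceil the end so that no bin is ever empty.
        r0, r1 = (i * h) // bins, -((-(i + 1) * h) // bins)
        for j in range(bins):
            c0, c1 = (j * w) // bins, -((-(j + 1) * w) // bins)
            out[:, i, j] = feat[:, r0:r1, c0:c1].mean(axis=(1, 2))
    return out


def upsample_nearest(grid, h, w):
    """Resize a (C, b, b) grid back to (C, h, w) by nearest-neighbour lookup."""
    b = grid.shape[1]
    rows = (np.arange(h) * b) // h
    cols = (np.arange(w) * b) // w
    return grid[:, rows[:, None], cols[None, :]]


def pyramid_pooling(feat, bin_sizes=(1, 2, 3, 6)):
    """Concatenate the input with its pooled-and-upsampled context branches.

    bin_sizes is a hypothetical choice; each branch summarises the map at a
    different spatial granularity, from global (1x1) to fairly local (6x6).
    """
    _, h, w = feat.shape
    branches = [feat.astype(np.float64)]
    for b in bin_sizes:
        branches.append(upsample_nearest(adaptive_avg_pool(feat, b), h, w))
    return np.concatenate(branches, axis=0)


if __name__ == "__main__":
    feat = np.random.rand(8, 32, 32)
    print(pyramid_pooling(feat).shape)
```

In this sketch the output keeps the input's spatial size and stacks one extra copy of the channels per bin size, so every location sees both its own features and context averaged over progressively larger regions.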