An attentive convolutional transformer-based network for road safety

Abstract

The sharp surge in the number of vehicles on the road has led to numerous traffic violations, and detecting such violations in a dynamic environment is a complex task. This paper focuses on detecting one particular violation: riding a motorcycle without a helmet. We address the problem as an object detection task and propose a novel convolutional encoder-transformer decoder architecture (CETD) for it. The proposed architecture comprises two primary modules: a convolutional neural network (CNN)-based encoder that extracts high-level features from input images, and a transformer-based decoder that uses attention mechanisms to identify relevant components in the image, such as helmets or missing helmets. The architecture is designed to achieve accurate object detection and localization by combining advanced feature extraction with state-of-the-art attention mechanisms. The layer normalization module of the proposed architecture acts as an intermediate bias stabilizer for the encoder-decoder network. The design also includes a standard backbone feature extractor and a fused backbone feature extractor. Compared with other state-of-the-art models, the model detects occluded objects with higher confidence, and the detector works end to end with fewer handcrafted features. We studied the applicability of the model on a miniature version of the COCO dataset, where it performs competitively with the Faster region-based convolutional neural network (Faster R-CNN) and the Mask region-based convolutional neural network (Mask R-CNN). The proposed model is also fine-tuned on traffic data with occlusion for helmet detection. Its performance on helmet detection from traffic data is comparable with state-of-the-art real-time detectors such as EspiNet (a modified Faster R-CNN network), the single-shot multibox detector (SSD), and the You Only Look Once version 5 (YOLO v5) detector. Specifically, the model outperforms EspiNet V2 (a modified Faster R-CNN network) by 0.47, SSD by 3.9, and YOLO v5 by 0.37 in terms of mean average precision, and its mean average precision is further improved by 0.87 using object-aware copy-paste augmentation. The model's average occlusion-detection confidence is 5.1 percent higher than that of YOLO v5. Experimental results show that the proposed model adapts better to both specific-object (helmet) detection and generic object detection tasks.
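
The abstract describes the architecture only at a high level. As a rough illustration, the sketch below wires a CNN encoder to a transformer decoder with an intermediate layer normalization, in a DETR-like style. Every concrete choice here is an assumption made for illustration: the ResNet-50 backbone, the learned object queries, the hidden size, and the two-class (helmet / no-helmet) head are not taken from the paper, and the fused-backbone variant mentioned in the abstract is not reproduced.

import torch
import torch.nn as nn
import torchvision

class CETDSketch(nn.Module):
    """Minimal convolutional-encoder / transformer-decoder detector sketch.
    Module choices and sizes are illustrative assumptions, not the paper's
    exact CETD configuration."""

    def __init__(self, num_classes=2, hidden_dim=256, num_queries=100,
                 nheads=8, num_decoder_layers=6):
        super().__init__()
        # Convolutional encoder: a standard backbone feature extractor
        # (ResNet-50 assumed here) producing high-level feature maps.
        backbone = torchvision.models.resnet50(weights=None)
        self.backbone = nn.Sequential(*list(backbone.children())[:-2])
        self.input_proj = nn.Conv2d(2048, hidden_dim, kernel_size=1)
        # Layer normalization as an intermediate stabilizer between the
        # encoder features and the transformer decoder.
        self.norm = nn.LayerNorm(hidden_dim)
        # Transformer decoder with learned object queries; attention over the
        # image features picks out components such as helmet / missing helmet.
        layer = nn.TransformerDecoderLayer(d_model=hidden_dim, nhead=nheads,
                                           batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=num_decoder_layers)
        self.queries = nn.Embedding(num_queries, hidden_dim)
        # Prediction heads: class logits (plus a "no object" class) and
        # normalized box coordinates (cx, cy, w, h).
        self.class_head = nn.Linear(hidden_dim, num_classes + 1)
        self.box_head = nn.Linear(hidden_dim, 4)

    def forward(self, images):
        feats = self.input_proj(self.backbone(images))         # (B, D, H, W)
        b = feats.shape[0]
        memory = self.norm(feats.flatten(2).transpose(1, 2))   # (B, H*W, D)
        queries = self.queries.weight.unsqueeze(0).expand(b, -1, -1)
        hs = self.decoder(queries, memory)                     # (B, Q, D)
        return self.class_head(hs), self.box_head(hs).sigmoid()

if __name__ == "__main__":
    model = CETDSketch(num_classes=2)             # e.g. helmet / no-helmet
    logits, boxes = model(torch.randn(1, 3, 512, 512))
    print(logits.shape, boxes.shape)              # (1, 100, 3), (1, 100, 4)

In this sketch the layer normalization sits between the flattened encoder features and the decoder, mirroring the abstract's description of it as an intermediate stabilizer for the encoder-decoder network.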
