
OmniPD: One-Step Person Detection in Top-View Omnidirectional Indoor Scenes


Abstract

We propose a one-step person detector for top-view omnidirectional indoor scenes based on convolutional neural networks (CNNs). While state-of-the-art person detectors reach competitive results on perspective images, the lack of CNN architectures and training data that account for the distortion of omnidirectional images makes current approaches inapplicable to our data. The method predicts bounding boxes of multiple persons directly in omnidirectional images without perspective transformation, which reduces the overhead of pre- and post-processing and enables real-time performance. The basic idea is to use transfer learning: CNNs trained on perspective images are fine-tuned for detection in omnidirectional images with the help of data augmentation techniques. We fine-tune two variants of the Single Shot MultiBox Detector (SSD). The first uses MobileNet v1 FPN as feature extractor (moSSD), the second ResNet50 v1 FPN (resSSD). Both models are pre-trained on the Microsoft Common Objects in Context (COCO) dataset. We fine-tune both models on the PASCAL VOC07 and VOC12 datasets, specifically on the person class. Random 90-degree rotation and random vertical flipping are used for data augmentation in addition to the methods proposed for the original SSD. We reach an average precision (AP) of 67.3% with moSSD and 74.9% with resSSD on the evaluation dataset. To enhance the fine-tuning process, we add a subset of the HDA Person dataset and a subset of the PIROPO database and restrict the perspective images to PASCAL VOC07. The AP rises to 83.2% for moSSD and 86.3% for resSSD. The average inference speed is 28 ms per image for moSSD and 38 ms per image for resSSD on an Nvidia Quadro P6000. Our method is applicable to other CNN-based object detectors and can potentially generalize to detecting other objects in omnidirectional images.
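The omnidirectional-specific augmentation amounts to a simple joint transform of the image and its boxes. Below is a minimal Python/NumPy sketch of the random 90-degree rotation and random vertical flip described above; the function name and the normalized [ymin, xmin, ymax, xmax] box layout are assumptions for illustration, not taken from the paper.

```python
import random
import numpy as np

def augment_for_omni(image, boxes):
    """Sketch of the extra augmentations: random vertical flip and a random
    90-degree rotation, applied jointly to an (H, W, 3) image and its boxes
    given as normalized [ymin, xmin, ymax, xmax] rows (assumed layout)."""
    # Random vertical flip with probability 0.5.
    if random.random() < 0.5:
        image = np.flipud(image)
        ymin, xmin, ymax, xmax = boxes.T
        boxes = np.stack([1.0 - ymax, xmin, 1.0 - ymin, xmax], axis=1)
    # Rotate by a random multiple of 90 degrees (counter-clockwise).
    k = random.randint(0, 3)
    image = np.rot90(image, k)
    for _ in range(k):
        ymin, xmin, ymax, xmax = boxes.T
        # A point (y, x) maps to (1 - x, y) under one 90-degree CCW rotation.
        boxes = np.stack([1.0 - xmax, ymin, 1.0 - xmin, ymax], axis=1)
    return image, boxes
```

In a training pipeline this transform would be applied per example on top of the standard SSD augmentations before fine-tuning on the omnidirectional data.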
