Journal: Pattern Recognition: The Journal of the Pattern Recognition Society

A quadrilateral scene text detector with two-stage network architecture

Abstract

Many state-of-the-art methods can localize scene text only with rotated rectangle boundaries, which may lead to incorrect rectification of the detected text and to erroneous elimination of proposals or detections during non-maximum suppression (NMS). The few existing methods that can detect scene text with quadrilateral boundaries are based on one-stage architectures or sliding-window scanning and thus have sub-optimal performance. To address these problems, we propose an end-to-end two-stage network architecture for scene text detection that can accurately localize scene text with quadrilateral boundaries. In the first stage, we propose a quadrilateral region proposal network (QRPN) that generates quadrilateral proposals using a newly proposed quadrilateral regression algorithm. In the second stage, we introduce a novel weighted RoI pooling module with learned weight masks to pool the features, and then classify the proposals and refine their shapes by applying the proposed quadrilateral regression algorithm again. In particular, during training we adopt a dual-branch structure of detection heads, that is, we jointly train the quadrilateral detection head and an additional rotated rectangle detection head. Furthermore, we develop an accelerated NMS algorithm with O(n log n) complexity to eliminate redundant quadrilateral text proposals and detections in the first and second stages, respectively. Experiments on several challenging benchmarks demonstrate the superior performance of the proposed method, which achieves state-of-the-art results on the widely used ICDAR 2017 MLT, RCTW, and ICDAR 2015 Incidental Scene Text benchmarks. (C) 2020 Elsevier Ltd. All rights reserved.
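
To make the NMS step concrete, the following is a minimal sketch of ordinary greedy NMS applied to quadrilaterals. It is not the accelerated O(n log n) algorithm of the paper, whose details the abstract does not give; this baseline compares each candidate against every kept box and is O(n^2) in the worst case. The function names (`polygon_area`, `clip_convex`, `quad_iou`, `greedy_quad_nms`), the 0.5 threshold, and the assumption that every quadrilateral is convex are illustrative choices, not taken from the source.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Quad = List[Point]  # four vertices of a convex quadrilateral (assumption)


def polygon_area(poly: List[Point]) -> float:
    """Signed area by the shoelace formula (positive for counter-clockwise order)."""
    area = 0.0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        area += x1 * y2 - x2 * y1
    return area / 2.0


def clip_convex(subject: List[Point], clip: List[Point]) -> List[Point]:
    """Sutherland-Hodgman clipping of `subject` against a convex `clip` polygon."""
    if polygon_area(clip) < 0:  # normalize the clip polygon to counter-clockwise
        clip = clip[::-1]
    output = list(subject)
    for i in range(len(clip)):
        if not output:
            break
        a, b = clip[i], clip[(i + 1) % len(clip)]

        def side(p: Point) -> float:
            # > 0 when p lies to the left of the directed edge a -> b (inside)
            return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

        inputs, output = output, []
        for j in range(len(inputs)):
            p, q = inputs[j], inputs[(j + 1) % len(inputs)]
            sp, sq = side(p), side(q)
            if sp >= 0:
                output.append(p)
            if (sp > 0 and sq < 0) or (sp < 0 and sq > 0):
                t = sp / (sp - sq)  # edge p -> q crosses the clip line here
                output.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return output


def quad_iou(q1: Quad, q2: Quad) -> float:
    """Intersection over union of two convex quadrilaterals."""
    inter = clip_convex(q1, q2)
    inter_area = abs(polygon_area(inter)) if len(inter) >= 3 else 0.0
    union = abs(polygon_area(q1)) + abs(polygon_area(q2)) - inter_area
    return inter_area / union if union > 0 else 0.0


def greedy_quad_nms(quads: List[Quad], scores: List[float],
                    iou_threshold: float = 0.5) -> List[int]:
    """Return indices of kept quadrilaterals, highest score first."""
    order = sorted(range(len(quads)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(quad_iou(quads[i], quads[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep
```

Computing the overlap on the quadrilaterals themselves, rather than on enclosing rotated rectangles, is what avoids suppressing a correct detection merely because its bounding rectangle overlaps a neighbour. Any speed-up over this baseline would have to come from the paper's own algorithm.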
