PyrBoxes: An efficient multi-scale scene text detector with feature pyramids

Sheng Fenfen; Chen Zhineng; Zhang Wei; Xu Bo

Abstract

Scene text detection has attracted much research because of its importance to various applications. However, current approaches cannot keep a good balance between accuracy and speed: they achieve high accuracy at a low processing speed, or vice versa. In this paper, we propose a novel model, named PyrBoxes, for efficient and effective multi-scale scene text detection. PyrBoxes consists of an SSD-based backbone that uses deep layers with strong semantics to detect text of various sizes, and a proposed grouped pyramid module that leverages basic layers to add detailed location information to detection. Most existing detectors discard features from the basic layers for efficiency reasons. We argue that these layers contain fine-grained information that is complementary to high-level semantics. Based on this, the grouped pyramid module recursively combines the basic layers into a detection layer via a top-down partition and a bottom-up group. Extensive experiments on both horizontal and oriented benchmarks, including ICDAR2013 Focused Scene Text, ICDAR2015 Incidental Text and COCO-Text, demonstrate that PyrBoxes achieves state-of-the-art or highly competitive performance compared with baselines, while running significantly faster at inference. Furthermore, experiments on another dataset, ChiTVText, show that PyrBoxes generalizes well to Chinese text and long text lines. Qualitative visualizations show that, as expected, PyrBoxes provides more accurate locations and reduces the rate of missed detections, especially for small text. (C) 2019 Elsevier B.V. All rights reserved.

Bibliographic information

Source
Pattern Recognition Letters | 2019, issue 7 | pp. 228-234 | 7 pages
Authors
Sheng Fenfen; Chen Zhineng; Zhang Wei; Xu Bo
Author affiliations

Chinese Acad Sci Inst Automat Beijing 100190 Peoples R China|Univ Chinese Acad Sci Beijing 100190 Peoples R China;

Chinese Acad Sci Inst Automat Beijing 100190 Peoples R China;

JD AI Res Beijing 100101 Peoples R China;

Chinese Acad Sci Inst Automat Beijing 100190 Peoples R China;
Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
Original format: PDF
Language: English
Keywords
Scene text detection; Multi-scale text detection; Grouped pyramid module; Efficient and effective;
