Journal: Neurocomputing

Scene text segmentation using low variation extremal regions and sorting based character grouping


Abstract

Extracting textual information from natural scene images is a challenging task because of varying imaging conditions and the diversity of text properties. Scene text segmentation is an important step in the pipeline and significantly affects the final recognition performance. This paper proposes a new scene text segmentation method. First, a novel approach to character candidate generation based on extremal regions (ERs) is introduced. Subpaths with low area variation are extracted from the ER tree. Instead of using the minimum-variation criterion to select character candidates, the position of an ER within the extracted subpath is used as the selection criterion. Each subpath is represented by one ER, which is passed to an SVM-based classification step. Next, a novel character candidate grouping method is used to discard non-character objects that were wrongly classified as characters. The proposed approach estimates the vertical positions of text lines by sorting the y coordinates of region centroids and checks the spatial relations of adjacent regions within each line. This step significantly improves precision and has lower computational complexity than hierarchical clustering methods. The final step restores character ERs erroneously eliminated by the SVM classifier, exploiting text layout properties to correct false negative classifications. Experimental results on the ICDAR 2013 dataset show that the proposed character candidate generation method efficiently prunes repeating regions and achieves a character recall rate superior to that of a recently published ER-based method. The proposed segmentation algorithm achieves competitive performance compared with state-of-the-art methods. (C) 2017 Elsevier B.V. All rights reserved.
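The abstract does not give the exact variation measure, thresholds, or which position in a subpath is chosen, so the following is only a rough sketch of the candidate generation idea under stated assumptions. It treats one branch of the ER tree as a list of region areas ordered from the innermost region outward, splits it into runs where the relative area change stays small, and picks one ER per run by its position. The names `low_variation_subpaths` and `representative`, and the parameters `max_variation`, `min_length`, and `position`, are hypothetical and not taken from the paper.

```python
from typing import List, Tuple


def low_variation_subpaths(
    areas: List[int],
    max_variation: float = 0.1,  # assumed threshold on relative area change
    min_length: int = 3,         # assumed minimum number of ERs in a subpath
) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) index pairs of maximal runs along one
    ER-tree branch in which consecutive regions change little in area."""
    subpaths = []
    start = 0
    for i in range(1, len(areas) + 1):
        # A run ends at the end of the branch or where the area jumps.
        ends_run = (
            i == len(areas)
            or abs(areas[i] - areas[i - 1]) / areas[i - 1] > max_variation
        )
        if ends_run:
            if i - start >= min_length:
                subpaths.append((start, i - 1))
            start = i
    return subpaths


def representative(subpath: Tuple[int, int], position: float = 0.5) -> int:
    """Pick one ER index per subpath by relative position (assumed: the
    middle, rounded down), rather than by minimum variation."""
    start, end = subpath
    return start + int(position * (end - start))


if __name__ == "__main__":
    branch_areas = [100, 102, 104, 105, 200, 205, 208, 210, 500]
    for sp in low_variation_subpaths(branch_areas):
        print(sp, "->", representative(sp))
```

Selecting one ER per low-variation run is what prunes near-duplicate regions in this sketch: every region in a run covers almost the same pixels, so a single representative is enough for the classifier.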
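The sorting-based grouping step can be illustrated with a minimal sketch, assuming each candidate region is summarized by its centroid and bounding-box size. Regions are sorted by centroid y to form text lines, then each line is sorted by x and a region is kept only if it has a compatible horizontal neighbor. The `Region` type, the function names, and the tolerances `y_tol`, `max_gap`, and `max_height_ratio` are hypothetical; the abstract does not specify the actual spatial checks.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Region:
    x: float  # centroid x
    y: float  # centroid y
    w: float  # bounding-box width
    h: float  # bounding-box height


def group_into_lines(regions: List[Region], y_tol: float = 0.5) -> List[List[Region]]:
    """Sort regions by centroid y and start a new line whenever the vertical
    gap to the previous region exceeds y_tol times the larger height."""
    lines: List[List[Region]] = []
    for r in sorted(regions, key=lambda reg: reg.y):
        if lines and abs(r.y - lines[-1][-1].y) <= y_tol * max(r.h, lines[-1][-1].h):
            lines[-1].append(r)
        else:
            lines.append([r])
    return lines


def filter_line(
    line: List[Region],
    max_gap: float = 2.0,           # assumed limit on centroid distance, in widths
    max_height_ratio: float = 2.0,  # assumed limit on height mismatch
) -> List[Region]:
    """Keep regions that have at least one close, similarly sized neighbor
    among the regions adjacent to them in x order."""
    line = sorted(line, key=lambda reg: reg.x)
    keep = [False] * len(line)
    for i in range(len(line) - 1):
        a, b = line[i], line[i + 1]
        close = (b.x - a.x) <= max_gap * max(a.w, b.w)
        similar = max(a.h, b.h) <= max_height_ratio * min(a.h, b.h)
        if close and similar:
            keep[i] = keep[i + 1] = True
    return [r for r, k in zip(line, keep) if k]


def group_characters(regions: List[Region]) -> List[Region]:
    """Discard regions that do not fit into any text line."""
    kept: List[Region] = []
    for line in group_into_lines(regions):
        kept.extend(filter_line(line))
    return kept
```

Both passes are dominated by sorting, so this sketch runs in O(n log n) time for n regions, and it only ever compares regions that are adjacent in sorted order rather than all pairs.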
