Journal: Pattern Analysis and Applications
A novel method for binarization of scene text images and its application in text identification

Abstract

The aim of this article is twofold. First, we propose an effective methodology for binarization of scene images, evaluated on the publicly available ICDAR 2011 Born Digital data set. We introduce a new concept, the variance map of a gray-level image, for detecting text boundaries in an image. Based on this boundary information, the image is binarized by adaptive thresholding, which produces a number of connected components. Second, these connected components are examined in order to identify possible text components. To this end, we propose a number of shape-based features that distinguish text from non-text components. We treat text component identification as a one-class classification problem, because the ICDAR 2011 Born Digital data set provides ground truth only for the text class. The ground truth text components are used to estimate the statistical distribution of the shape-based features. We observe that the features do not all follow a single family of distributions. We therefore construct a joint distribution using a multivariate Gaussian copula, which allows different marginal distributions to be coupled. Our experiments suggest that the copula-based method describes the feature distribution better than a multivariate Gaussian distribution. Finally, a connected component of unknown class is evaluated against the trained statistical model, and a hypothesis test decides whether it is a possible text component. For a comparative study, we consider a number of state-of-the-art methods. Our proposed approach significantly outperforms most of these methods in terms of recall, precision and F-measure in both the binarization and text identification tasks.
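To illustrate how a Gaussian copula can couple features whose marginals come from different families, here is a minimal sketch for the two-feature case. It is not the authors' implementation: the abstract does not name the features, their marginal families, or the test statistic, so the normal and exponential marginals, the function names, and all parameter values below are assumptions chosen only for illustration.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal used for the copula transform


def exponential_cdf(x, rate):
    """CDF of an exponential marginal (hypothetical choice for one feature)."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0


def exponential_pdf(x, rate):
    """Density of the same exponential marginal."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0


def clip_unit(u, eps=1e-12):
    """Keep u strictly inside (0, 1) so that inv_cdf is defined."""
    return min(max(u, eps), 1.0 - eps)


def safe_log(p):
    """log(p), with log(0) mapped to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def gaussian_copula_log_density(u1, u2, rho):
    """Log-density of a bivariate Gaussian copula with correlation rho."""
    z1 = STD_NORMAL.inv_cdf(clip_unit(u1))
    z2 = STD_NORMAL.inv_cdf(clip_unit(u2))
    one_minus = 1.0 - rho * rho
    quad = (rho * rho * (z1 * z1 + z2 * z2) - 2.0 * rho * z1 * z2) / (2.0 * one_minus)
    return -0.5 * math.log(one_minus) - quad


def joint_log_density(x1, x2, normal_marginal, exp_rate, rho):
    """Joint log-density = copula log-density + marginal log-densities."""
    u1 = normal_marginal.cdf(x1)
    u2 = exponential_cdf(x2, exp_rate)
    log_marginals = safe_log(normal_marginal.pdf(x1)) + safe_log(exponential_pdf(x2, exp_rate))
    return gaussian_copula_log_density(u1, u2, rho) + log_marginals


# Hypothetical fitted parameters for two shape features of text components.
feature_1 = NormalDist(mu=1.0, sigma=0.2)
score = joint_log_density(1.05, 0.4, feature_1, exp_rate=2.0, rho=0.6)
```

In a one-class setting, a score like this could be compared against a threshold estimated from the ground truth text components, accepting a component as possible text when its score is high enough. The specific hypothesis test used in the article is not described in the abstract, so that decision rule is likewise an assumption.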