Multimedia Tools and Applications

Scene text detection with fully convolutional neural networks



Abstract

Text detection in scene images has become a hot topic in computer vision and artificial intelligence research, owing to its wide range of applications and challenges. Most state-of-the-art deep-learning methods for text detection rely on text bounding box regression, and they cannot handle curved scene text well. In this paper, we propose a new framework for arbitrarily oriented text detection in natural images based on fully convolutional neural networks. The main idea is to represent a text instance in two forms: a text center block and a word stroke region. These two elements are detected by two fully convolutional networks, respectively, and final detections are produced by a word-region surrounding-box algorithm. The proposed method does not need to regress the exact bounding box of a text instance, because the predicted text block region itself implicitly contains position and orientation information. Moreover, our method handles text in different languages, arbitrary orientations, curved shapes, and various fonts well. To validate its effectiveness, we perform experiments on three public datasets, MSRA-TD500, USTB-SV1K, and ICDAR2013, and compare it with other state-of-the-art methods. The results demonstrate that the proposed method achieves competitive performance: based on VGG-16, it reaches an F-measure of 78.84% on MSRA-TD500, 59.34% on USTB-SV1K, and 88.21% on ICDAR2013.
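The post-processing the abstract describes can be pictured as follows. This is a hypothetical sketch, not the authors' code: it fuses two per-pixel score maps (a "text center block" map and a "word stroke region" map, as predicted by the two FCNs) by keeping pixels where both scores exceed a threshold, then wraps each connected region in a surrounding box. The function name, thresholds, and the use of axis-aligned rather than oriented boxes are simplifying assumptions; the paper's surrounding-box step recovers orientation as well.

```python
# Hypothetical sketch of the two-map fusion + surrounding-box step.
# Score maps are plain nested lists of floats in [0, 1]; boxes returned
# here are axis-aligned (x_min, y_min, x_max, y_max) for simplicity,
# whereas the paper produces oriented boxes.
from collections import deque

def fuse_and_box(center_map, stroke_map, t_center=0.7, t_stroke=0.5):
    """Return one (x_min, y_min, x_max, y_max) box per fused text region."""
    h, w = len(center_map), len(center_map[0])
    # Keep only pixels that both networks agree belong to text.
    keep = [[center_map[y][x] >= t_center and stroke_map[y][x] >= t_stroke
             for x in range(w)] for y in range(h)]
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not keep[y][x] or seen[y][x]:
                continue
            # BFS over the 4-connected component of kept pixels,
            # tracking the extremes that define its surrounding box.
            q = deque([(y, x)])
            seen[y][x] = True
            x0 = x1 = x
            y0 = y1 = y
            while q:
                cy, cx = q.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w \
                            and keep[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        q.append((ny, nx))
            boxes.append((x0, y0, x1, y1))
    return boxes
```

Because the text center block is a strict subset of the word stroke region, the intersection step suppresses stray stroke responses that lack center support, which is one way the block representation can stand in for explicit box regression.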
