Conference: International Conference on Multimedia Modeling

TK-Text: Multi-shaped Scene Text Detection via Instance Segmentation

Abstract

Benefiting from the development of deep neural networks, scene text detectors have progressed rapidly over the past few years and achieved outstanding performance on several standard benchmarks. However, most existing methods represent text with quadrilateral bounding boxes, which are usually inadequate for multi-shaped text such as curved text. To keep detection performance consistent on both quadrilateral and curved text, we present a novel representation for multi-shaped text, the text kernel. On the basis of the text kernel, we propose a simple yet effective scene text detection method named TK-Text. The proposed method consists of three steps: a text-context-aware network, segmentation map generation, and text-kernel-based post-clustering. In the text-context-aware network, we construct a segmentation-based network to extract feature maps from natural scene images; these feature maps are further enhanced with text context information extracted by an attention scheme, TKAB. In segmentation map generation, text kernels and rough boundaries of text instances are segmented from the enhanced feature maps. Finally, the rough text instances are gradually refined into accurate text instances by clustering based on the text kernels. Experiments on public benchmarks, including SCUT-CTW1500, ICDAR 2015 and ICDAR 2017 MLT, demonstrate that the proposed method achieves competitive detection performance compared with existing methods.
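
The abstract does not specify how the text-kernel-based post-clustering is carried out. As a rough illustration of the general idea only, the sketch below labels each connected text kernel and then grows those labels outward, breadth-first, into the rough text mask, so that touching instances are separated by their kernels. The function names (`label_kernels`, `grow_from_kernels`), the 4-connectivity, the binary-mask inputs and the breadth-first growth rule are all assumptions made for this example; they are not taken from the paper and should not be read as the authors' algorithm.

```python
from collections import deque

# Offsets of the 4-connected neighbours (an assumption of this sketch).
NEIGHBOURS = ((-1, 0), (1, 0), (0, -1), (0, 1))


def label_kernels(kernel_mask):
    """Give each 4-connected region of the binary kernel mask its own label (1, 2, ...)."""
    h, w = len(kernel_mask), len(kernel_mask[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if kernel_mask[y][x] and labels[y][x] == 0:
                count += 1
                labels[y][x] = count
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    for dy, dx in NEIGHBOURS:
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and kernel_mask[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


def grow_from_kernels(kernel_mask, text_mask):
    """Assign rough-text pixels to the kernel that reaches them first (multi-source BFS).

    Hypothetical stand-in for the post-clustering step: text pixels that no
    kernel can reach keep label 0.
    """
    labels, count = label_kernels(kernel_mask)
    h, w = len(text_mask), len(text_mask[0])
    queue = deque((y, x) for y in range(h) for x in range(w) if labels[y][x])
    while queue:
        cy, cx = queue.popleft()
        for dy, dx in NEIGHBOURS:
            ny, nx = cy + dy, cx + dx
            if (0 <= ny < h and 0 <= nx < w
                    and text_mask[ny][nx] and labels[ny][nx] == 0):
                labels[ny][nx] = labels[cy][cx]
                queue.append((ny, nx))
    return labels, count
```

In this sketch, two text regions that merge into a single blob in the rough mask still end up as separate instances as long as their kernels are disjoint, which is the property a kernel representation is meant to provide for closely spaced or curved text.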