Conference: Asian Conference on Computer Vision

Detecting Text in the Wild with Deep Character Embedding Network

Abstract

Most text detection methods assume that text is horizontal or multi-oriented and therefore use quadrangles as the basic detection unit. However, text in the wild is often curved or distorted by perspective, which existing approaches cannot handle easily. In this paper, we propose a deep character embedding network (CENet) that simultaneously predicts the bounding boxes of characters and their embedding vectors, turning text detection into a simple clustering task in the character embedding space. The proposed method does not require the strong assumption that text forms a straight line, which gives it flexibility on arbitrarily curved or perspectively distorted text. For the character detection task, a dense prediction subnetwork is designed to obtain the confidence scores and bounding boxes of characters. For the character embedding task, a subnetwork is trained with a contrastive loss to project detected characters into the embedding space. The two tasks share a backbone CNN from which multi-scale feature maps are extracted. The final text regions are obtained by thresholding the character confidence and the embedding distance between character pairs. We evaluated our method on ICDAR13, ICDAR15, MSRA-TD500, and Total-Text. The proposed method achieves state-of-the-art or comparable performance on all of these datasets and shows a substantial improvement on the irregular-text dataset, Total-Text.
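
The grouping step described in the abstract (thresholding on character confidence, then on the embedding distance between character pairs) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration: the function names, the threshold values, the use of Euclidean distance, the union-find merging, and the standard pairwise form of the contrastive loss are not taken from the paper, which may differ in all of these details.

```python
import math
from typing import Dict, List, Sequence


def contrastive_loss(dist: float, same_word: bool, margin: float = 1.0) -> float:
    """Standard pairwise contrastive loss (assumed form, not the paper's exact one).

    Pairs from the same word are pulled together; pairs from different
    words are pushed apart until they are at least `margin` away.
    """
    if same_word:
        return dist ** 2
    return max(0.0, margin - dist) ** 2


def group_characters(
    scores: Sequence[float],
    embeddings: Sequence[Sequence[float]],
    score_thresh: float = 0.5,   # hypothetical value
    dist_thresh: float = 0.5,    # hypothetical value
) -> List[List[int]]:
    """Group detected characters into text instances.

    1. Keep characters whose confidence is at least `score_thresh`.
    2. Link every pair whose embedding distance is below `dist_thresh`.
    3. Return the connected components as lists of character indices.
    """
    keep = [i for i, s in enumerate(scores) if s >= score_thresh]
    parent: Dict[int, int] = {i: i for i in keep}

    def find(i: int) -> int:
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for pos, a in enumerate(keep):
        for b in keep[pos + 1:]:
            if math.dist(embeddings[a], embeddings[b]) < dist_thresh:
                parent[find(a)] = find(b)

    groups: Dict[int, List[int]] = {}
    for i in keep:
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    # Toy data: five detected characters with 2-D embeddings.
    scores = [0.9, 0.8, 0.95, 0.2, 0.7]
    embeddings = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (0.0, 0.1), (5.2, 5.0)]
    print(group_characters(scores, embeddings))
```

Because characters are linked pairwise, two characters at opposite ends of a word can land in the same group through a chain of neighbours even when their own embeddings are far apart. Nothing in this grouping step depends on the characters lying along a straight line.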