IEEE Transactions on Circuits and Systems for Video Technology
Gradient Vector Flow and Grouping-Based Method for Arbitrarily Oriented Scene Text Detection in Video Images


Abstract

Text detection in videos is challenging due to the low resolution and complex backgrounds of video frames. Moreover, the arbitrary orientation of scene text lines in video makes the problem even harder. This paper presents a new method that extracts text lines of any orientation based on gradient vector flow (GVF) and neighbor component grouping. The GVF of edge pixels in the Sobel edge map of the input frame is explored to identify the dominant edge pixels that represent text components. The method extracts the edge components corresponding to dominant pixels in the Sobel edge map, which we call text candidates (TC) of the text lines. We propose two grouping schemes. The first finds nearest neighbors based on geometrical properties of the TC to group broken segments and neighboring characters, which results in word patches. The end and junction points of the skeleton of each word patch are used to eliminate false positives, yielding candidate text components (CTC). The second scheme uses the direction and the size of the CTC to extract neighboring CTC and to restore missing CTC, which enables arbitrarily oriented text line detection in video frames. Experimental results on different datasets, including arbitrarily oriented text data, non-horizontal and horizontal text data, Hua's data and ICDAR-03 data (camera images), show that the proposed method outperforms existing methods in terms of recall, precision and F-measure.

