
A New Method for Word Segmentation from Arbitrarily-Oriented Video Text Lines



Abstract

Word segmentation has become a research topic for improving OCR accuracy in video text recognition, because video text lines suffer from arbitrary orientation, complex backgrounds and low resolution. In this paper, for word segmentation from arbitrarily-oriented video text lines, we extract four new gradient directional features for each Canny edge pixel of the input text line image, producing four respective pixel candidate images. The union of the four pixel candidate images gives a text candidate image. The sequence of the components in the text candidate image along the text line is determined using a nearest neighbor criterion. We then propose a two-stage method for segmenting words. In the first stage, we apply K-means clustering with K=2 to the distances between components to obtain probable word and non-word spacing clusters. Words are segmented based on the probable word spacing, and all other components are passed to the second stage for segmenting correct words. For each segmented and unsegmented word passed to the second stage, the method repeats all the steps up to the K-means clustering step to find probable word and non-word spacing clusters. The method then considers the cluster nature and the height and width of the components to identify the correct word spacing. The method is tested extensively on curved video text lines, non-horizontal straight lines, horizontal straight lines and text lines from the ICDAR-2003 competition data. Experimental results and a comparative study show that the results are encouraging and promising.
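The first-stage clustering can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the components are already ordered along a horizontal line and reduced to hypothetical `(left, right)` extents, that the gap between neighbors is `next.left - current.right`, and that the cluster with the larger center corresponds to word spacing. The names `two_means_1d` and `segment_words` are illustrative only, and the second-stage checks on cluster nature, height and width are omitted.

```python
from typing import List, Tuple

Extent = Tuple[float, float]  # hypothetical (left, right) extent of one component


def two_means_1d(values: List[float], iters: int = 100) -> Tuple[float, float, List[int]]:
    """Plain K-means with K=2 on 1-D values.

    Returns (low_center, high_center, labels); label 1 marks the cluster
    with the larger center.
    """
    if len(values) < 2:
        raise ValueError("need at least two gaps to form two clusters")
    # Assumption: initialise the centers at the smallest and largest value.
    lo, hi = float(min(values)), float(max(values))
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value joins the nearer center (ties go to low).
        labels = [1 if abs(v - hi) < abs(v - lo) else 0 for v in values]
        low_group = [v for v, lab in zip(values, labels) if lab == 0]
        high_group = [v for v, lab in zip(values, labels) if lab == 1]
        # Update step: each center moves to the mean of its members.
        new_lo = sum(low_group) / len(low_group) if low_group else lo
        new_hi = sum(high_group) / len(high_group) if high_group else hi
        if new_lo == lo and new_hi == hi:
            break
        lo, hi = new_lo, new_hi
    return lo, hi, labels


def segment_words(components: List[Extent]) -> List[List[Extent]]:
    """Split ordered components into words at gaps in the larger-gap cluster."""
    gaps = [b[0] - a[1] for a, b in zip(components, components[1:])]
    _, _, labels = two_means_1d(gaps)
    words, current = [], [components[0]]
    for comp, label in zip(components[1:], labels):
        if label == 1:  # assumed word spacing: close the current word
            words.append(current)
            current = [comp]
        else:  # assumed character spacing: stay in the same word
            current.append(comp)
    words.append(current)
    return words


if __name__ == "__main__":
    line = [(0, 10), (12, 22), (24, 34), (50, 60), (62, 72)]
    print(segment_words(line))
```

When all gaps are equal, the two centers coincide and the sketch returns the whole line as a single word. For curved or non-horizontal lines the gap would have to be measured between neighboring components along the line rather than along the x-axis.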
