Conference: 2012 10th IAPR International Workshop on Document Analysis Systems (DAS)

OTCYMIST: Otsu-Canny Minimal Spanning Tree for Born-Digital Images



Abstract

Text segmentation and localization algorithms are proposed for the born-digital image dataset. Binarization and edge detection are carried out separately on the three colour planes of the image. Connected components (CCs) obtained from the binarized image are filtered by thresholds on their area and aspect ratio, and only CCs that contain sufficient edge pixels are retained. A novel approach is presented in which the text components are represented as nodes of a graph, with each node placed at the centroid of an individual CC. Long edges of the graph's minimum spanning tree are removed. The pairwise height ratio is also used to remove likely non-text components. A new minimum spanning tree is then created from the remaining nodes. Horizontal grouping is performed on the CCs to generate bounding boxes of text strings. Overlapping bounding boxes are removed using an overlap-area threshold, and the non-overlapping and minimally overlapping bounding boxes are used for text segmentation. Vertical splitting is applied to generate bounding boxes at the word level. The proposed method is applied to all the images of the test dataset, and values of precision, recall and H-mean are obtained using different approaches.
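The graph step described in the abstract (nodes at CC centroids, a minimum spanning tree, removal of long edges) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of Prim's algorithm on a complete Euclidean graph, the `max_edge_length` parameter and the sample centroids are all assumptions made for illustration, and the height-ratio test and the rebuilding of the tree are left out.

```python
import math


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph.

    Returns a list of (i, j, length) edges. Hypothetical helper, not from the paper.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known distance from the tree to each node
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the node outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, best[u]))
        # Relax distances of the remaining nodes through u.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges


def group_by_cutting_long_edges(points, max_edge_length):
    """Drop MST edges longer than max_edge_length (an assumed threshold)
    and return the resulting groups of node indices."""
    root = list(range(len(points)))

    def find(x):
        # Union-find lookup with path halving.
        while root[x] != x:
            root[x] = root[root[x]]
            x = root[x]
        return x

    for i, j, d in minimum_spanning_tree(points):
        if d <= max_edge_length:
            root[find(i)] = find(j)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Made-up centroids: three components on one text line, two on another.
centroids = [(10, 20), (22, 21), (34, 19), (200, 150), (212, 151)]
groups = group_by_cutting_long_edges(centroids, max_edge_length=30.0)
print(groups)
```

With these sample points, the only tree edge longer than 30 pixels is the one joining the two clusters, so cutting it leaves one group per cluster. In practice the threshold would have to be chosen relative to the component sizes; the abstract does not state how the authors set it.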
