Line segmentation method applicable to document images containing handwriting and printed text characters or skewed text lines

Abstract

A text line segmentation method for a document image containing printed text and handwriting, or a document image containing skewed text lines. Connected components (CCs) are obtained for the document, and their bounding boxes and centroids are calculated. The CCs are sorted into three categories based on bounding box size: small objects, regular text objects, and large objects involving handwriting. The centroids of the regular text objects are used in a cluster analysis to find the vertical centers of the N text lines. Each CC is then assigned to one of the N lines based on the vertical distance between its centroid and the vertical centers of the text lines, and copied into a corresponding object board. Extra space is removed from the object boards to obtain the line segments. A large object involving handwriting is assigned to exactly one line and is absent from the others.
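The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation, and several details are assumptions made only for illustration: connected components are taken as already extracted and given as `(x, y, w, h)` bounding boxes, the centroid is approximated by the box center, the size thresholds (`small_ratio`, `large_ratio`) are hypothetical, and a simple one-dimensional gap-based grouping stands in for the unspecified cluster analysis. The sketch also assumes roughly horizontal lines and leaves out the object boards and the removal of extra space.

```python
from statistics import median


def box_center_y(box):
    """Vertical center of an (x, y, w, h) box; a stand-in for the CC centroid."""
    _, y, _, h = box
    return y + h / 2.0


def categorize(boxes, small_ratio=0.5, large_ratio=2.0):
    """Split boxes into small / regular / large by height relative to the median.

    The two ratios are illustrative assumptions, not values from the patent.
    """
    ref = median(b[3] for b in boxes)
    small, regular, large = [], [], []
    for box in boxes:
        h = box[3]
        if h < small_ratio * ref:
            small.append(box)
        elif h > large_ratio * ref:
            large.append(box)
        else:
            regular.append(box)
    return small, regular, large, ref


def line_centers(regular, gap):
    """Group sorted vertical centers; a new line starts when the gap is exceeded."""
    ys = sorted(box_center_y(b) for b in regular)
    if not ys:
        return []
    clusters = [[ys[0]]]
    for y in ys[1:]:
        if y - clusters[-1][-1] > gap:
            clusters.append([y])
        else:
            clusters[-1].append(y)
    return [sum(c) / len(c) for c in clusters]


def segment_lines(boxes):
    """Return (line centers, boxes per line); each box lands in exactly one line."""
    if not boxes:
        return [], []
    _, regular, _, ref = categorize(boxes)
    centers = line_centers(regular, gap=0.5 * ref)
    lines = [[] for _ in centers]
    if not centers:
        return centers, lines
    for box in boxes:  # small, regular, and large objects alike
        cy = box_center_y(box)
        nearest = min(range(len(centers)), key=lambda i: abs(cy - centers[i]))
        lines[nearest].append(box)
    return centers, lines
```

Only regular text objects contribute to the line centers, so a tall handwritten stroke or a stray dot cannot shift them. Every object is then assigned to its nearest center, which is why a large handwritten object ends up in one line and is absent from the others.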

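Bibliographic details

  • Publication number: US9104940B2

  • Publication date: 2015-08-11

  • Original format: PDF

  • Applicant/assignee: KONICA MINOLTA LABORATORY U.S.A. INC.

  • Application number: US201314015048

  • Inventor: CHAOHONG WU

  • Filing date: 2013-08-30

  • Classification: G06K9/34

  • Country: US

  • Database entry time: 2022-08-21 15:23:08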
