Line segmentation method applicable to document images containing handwriting and printed text characters or skewed text lines
Abstract
A text line segmentation method for a document image containing printed text and handwriting, or a document image containing skewed lines of printed text. Connected components (CCs) are obtained for the document, and their bounding boxes and centroids are calculated. The CCs are sorted into three categories based on bounding box size: small objects, regular text objects, and large objects involving handwriting. The centroids of the regular text objects are used in a cluster analysis to find the vertical centers of the N text lines. Each CC is then classified into one of the N lines based on the vertical distance between its centroid and the vertical centers of the text lines, and copied into a corresponding object board. Extra spaces are removed from the object boards to obtain the line segments. A large object involving handwriting is classified into exactly one of the lines and is absent from the others.
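The categorization, clustering, and line-assignment steps described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation, and it rests on several assumptions that do not come from the source: bounding boxes are given as hypothetical `(x, y, width, height)` tuples, the box center stands in for the CC centroid, size categories are decided from box height relative to the median height using illustrative ratios, and the unspecified "cluster analysis" is replaced by a simple gap-based grouping of sorted vertical centers. The function names are hypothetical, and the object boards and extra-space removal are omitted.

```python
from statistics import mean, median


def categorize(boxes, small_ratio=0.5, large_ratio=2.0):
    """Label each (x, y, w, h) box as 'small', 'regular', or 'large'.

    Assumption: category is decided by box height relative to the median
    height; the two ratios are illustrative, not taken from the source.
    """
    med_h = median(h for (_, _, _, h) in boxes)
    labels = []
    for (_, _, _, h) in boxes:
        if h < small_ratio * med_h:
            labels.append("small")
        elif h > large_ratio * med_h:
            labels.append("large")
        else:
            labels.append("regular")
    return labels


def line_centers(y_values, gap):
    """Group sorted vertical centers; a jump larger than `gap` starts a new line.

    Assumption: this gap-based grouping stands in for the cluster analysis,
    which the abstract does not specify.
    """
    ys = sorted(y_values)
    if not ys:
        return []
    groups = [[ys[0]]]
    for y in ys[1:]:
        if y - groups[-1][-1] > gap:
            groups.append([y])
        else:
            groups[-1].append(y)
    return [mean(g) for g in groups]


def segment_lines(boxes, gap=10.0):
    """Return (line centers, per-line lists of box indices)."""
    labels = categorize(boxes)
    # Assumption: the box center approximates the CC centroid.
    cy = [y + h / 2 for (_, y, _, h) in boxes]
    centers = line_centers(
        [c for c, lab in zip(cy, labels) if lab == "regular"], gap
    )
    if not centers:
        return [], []
    lines = [[] for _ in centers]
    for idx, c in enumerate(cy):
        # Every CC, including small and large ones, goes to exactly one line:
        # the line whose vertical center is nearest to the CC's own center.
        nearest = min(range(len(centers)), key=lambda k: abs(c - centers[k]))
        lines[nearest].append(idx)
    return centers, lines
```

Because each CC is assigned to a single nearest line, a large handwriting object that spans several printed lines ends up in only one of them, which matches the behavior stated in the last sentence of the abstract.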