A method for segmenting text characters in a document image using vertical projection of the central area of the characters

Abstract

A word segmentation method segments a text line into word segments; it is particularly advantageous for italic text but can also be used for regular text. A horizontal center zone of the text line, corresponding to the vertical center parts of the characters, is used to generate a center-zone-only vertical projection profile. The center zone is determined from a horizontal projection profile by locating the two major peaks of that profile and taking their positions as the upper and lower boundaries of the zone. Spacing segments (white gaps) in the vertical projection profile are identified and classified into two classes: character spacing (gaps between characters within a word) and word spacing (gaps between words). The word spacings are then used to segment the text line into word segments.
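
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the toy image, the peak-picking rule (the two highest local maxima of the row profile) and the gap classifier (a midpoint threshold between the narrowest and widest gap) are all assumptions made for illustration, since the abstract does not specify them.

```python
from typing import List, Tuple

Image = List[List[int]]  # binary text-line image: 1 = ink, 0 = background


def center_zone(img: Image) -> Tuple[int, int]:
    """Rows of the two highest local maxima of the horizontal projection profile."""
    prof = [sum(row) for row in img]          # ink count per row
    padded = [-1] + prof + [-1]               # sentinels so edge rows can be peaks
    peaks = [i for i in range(len(prof))
             if padded[i + 1] > padded[i] and padded[i + 1] >= padded[i + 2]]
    if len(peaks) < 2:
        raise ValueError("horizontal profile has fewer than two peaks")
    top, bottom = sorted(sorted(peaks, key=lambda i: prof[i], reverse=True)[:2])
    return top, bottom


def vertical_profile(img: Image, top: int, bottom: int) -> List[int]:
    """Ink count per column, restricted to rows top..bottom (inclusive)."""
    return [sum(img[r][c] for r in range(top, bottom + 1))
            for c in range(len(img[0]))]


def interior_gaps(profile: List[int]) -> List[Tuple[int, int]]:
    """Runs of empty columns (start, end inclusive) between the first and last inked column."""
    inked = [c for c, v in enumerate(profile) if v > 0]
    if not inked:
        return []
    gaps, start = [], None
    for c in range(inked[0], inked[-1] + 1):
        if profile[c] == 0:
            if start is None:
                start = c
        elif start is not None:
            gaps.append((start, c - 1))
            start = None
    return gaps


def segment_words(img: Image) -> List[Tuple[int, int]]:
    """Column ranges (start, end inclusive) of the words in a text line."""
    top, bottom = center_zone(img)
    prof = vertical_profile(img, top, bottom)
    inked = [c for c, v in enumerate(prof) if v > 0]
    if not inked:
        return []
    gaps = interior_gaps(prof)
    widths = [e - s + 1 for s, e in gaps]
    # Assumed classifier: gaps wider than the midpoint between the narrowest
    # and widest gap count as word spacing, the rest as character spacing.
    threshold = (min(widths) + max(widths)) / 2 if widths else 0
    words, start = [], inked[0]
    for (s, e), w in zip(gaps, widths):
        if w > threshold:
            words.append((start, s - 1))
            start = e + 1
    words.append((start, inked[-1]))
    return words


# Toy line: rows 1 and 3 are the dense rows; rows 0 and 4 hold slanted
# strokes that overhang the character gaps at columns 3 and 11.
DEMO: Image = [[1 if ch == "#" else 0 for ch in row] for row in [
    "..##..........",
    ".##.##...##.#.",
    ".#..#....#..#.",
    ".##.##...##.#.",
    "..........##..",
]]

print(center_zone(DEMO))    # (1, 3)
print(segment_words(DEMO))  # [(1, 5), (9, 12)]
```

In the toy image, a projection over the full line height sees only the wide gap at columns 6 to 8, because the overhanging strokes in rows 0 and 4 fill the narrow gaps. Restricting the projection to the center zone recovers all three gaps, which is the situation the abstract describes for italic text. With only one gap in a line, the midpoint rule has nothing to compare against and treats that gap as character spacing, so a practical classifier would need a more robust criterion.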
