
Method and system for preprocessing an image for optical character recognition


Abstract

A method and system are disclosed for preprocessing an image that includes a plurality of columns, or regions, of text. A plurality of components associated with the text is determined. A line height and a column spacing are then determined for the components, and each component is associated with a column based on the line height and the column spacing. A set of characteristic parameters is calculated for each column, and the components of each column are merged based on those parameters to form sub-words and words. A first plurality of words and/or sub-words is merged and processed as a first region, and a second plurality of words and/or sub-words is merged and processed as a second region, wherein at least a portion of the second region vertically overlaps at least a portion of the first region.
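The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not say how line height, column spacing, or the characteristic parameters are computed, so the sketch assumes the components are already available as bounding boxes, takes the median box height as the line height, accepts the column spacing as an input, and uses a single hypothetical parameter (`gap_factor`) to decide when neighbouring components are merged. All names (`Box`, `assign_columns`, `merge_into_words`, and so on) are invented for illustration.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box of one component (hypothetical type)."""
    x0: int
    y0: int
    x1: int
    y1: int


def estimate_line_height(boxes):
    # Assumption: the median component height stands in for the line height.
    return median(b.y1 - b.y0 for b in boxes)


def assign_columns(boxes, column_spacing):
    """Sweep left to right; start a new column when the horizontal gap
    to the current column's right edge exceeds column_spacing."""
    columns, right = [], None
    for b in sorted(boxes, key=lambda b: b.x0):
        if right is None or b.x0 - right > column_spacing:
            columns.append([b])
            right = b.x1
        else:
            columns[-1].append(b)
            right = max(right, b.x1)
    return columns


def merge_into_words(column, line_height, gap_factor=0.5):
    """Group a column's boxes into text lines, then merge neighbours whose
    horizontal gap is at most gap_factor * line_height (assumed rule)."""
    lines = []
    for b in sorted(column, key=lambda b: b.y0):
        if lines and abs(b.y0 - lines[-1][0].y0) < line_height / 2:
            lines[-1].append(b)
        else:
            lines.append([b])
    words = []
    for line in lines:
        line.sort(key=lambda b: b.x0)
        current = line[0]
        for b in line[1:]:
            if b.x0 - current.x1 <= gap_factor * line_height:
                current = Box(current.x0, min(current.y0, b.y0),
                              max(current.x1, b.x1), max(current.y1, b.y1))
            else:
                words.append(current)
                current = b
        words.append(current)
    return words


def bounding_box(boxes):
    """Smallest box enclosing all given boxes (one region per column)."""
    return Box(min(b.x0 for b in boxes), min(b.y0 for b in boxes),
               max(b.x1 for b in boxes), max(b.y1 for b in boxes))


def vertically_overlaps(a, b):
    """True when the vertical extents of the two boxes intersect."""
    return a.y0 < b.y1 and b.y0 < a.y1


# Made-up input: four components in a left column, two in a right column.
components = [
    Box(0, 0, 8, 10), Box(10, 0, 18, 10), Box(30, 0, 38, 10),
    Box(0, 20, 8, 30), Box(100, 5, 108, 15), Box(110, 5, 118, 15),
]
line_height = estimate_line_height(components)
columns = assign_columns(components, column_spacing=40)
words_per_column = [merge_into_words(c, line_height) for c in columns]
regions = [bounding_box(w) for w in words_per_column]
```

With this made-up input the sweep yields two columns, and the two resulting regions share part of their vertical extent, which is the situation the last sentence of the abstract describes. A real implementation would derive the components from the image itself (for example by connected-component analysis) and would need sturdier line grouping than the simple top-edge comparison used here.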
