Automatic separation of text from background in scanned images of complex documents

Abstract

A system converts a scanned image of a complex document into an image in which the text is preserved and separated from the background. The system first subdivides the scanned image into blocks and then examines each block pixel by pixel to construct a histogram of the pixels' gray-scale values. The histogram is partitioned into first, middle, and last regions. If one or more peaks occur in the first and last regions and a single peak occurs in the middle region, the pixels are re-examined to determine how often pixels at the gray-scale level of the middle peak lie near pixels at the level of a first-region peak. If this frequency is high, the middle peak is assumed to be background information. After determining the threshold, the system rescans the block, applying the threshold to separate the text from the background information within the block.
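
The block-wise procedure in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not give the region boundaries, the peak-detection rule, the neighbourhood used for the adjacency test, or the rule that turns peaks into a threshold. The values below (regions 0-84, 85-169, 170-255, the strongest bin per region as its peak, a 4-neighbourhood with a tolerance of 8 gray levels, a "high" frequency of 0.1, midpoint thresholds, and a fallback of 128) are all assumptions chosen for illustration, as is the premise that dark text falls in the first region. All function names are hypothetical.

```python
from collections import Counter


def histogram(block):
    """Count how often each gray level (0-255) occurs in a 2-D block."""
    return Counter(v for row in block for v in row)


def region_peak(hist, lo, hi, min_count):
    """Most frequent gray level in [lo, hi], or None if it is too rare.

    Assumption: one peak per region, taken as the strongest bin.
    """
    level = max(range(lo, hi + 1), key=lambda g: hist[g])
    return level if hist[level] >= min_count else None


def adjacency_frequency(block, mid, dark, tol=8):
    """Fraction of middle-peak pixels with a 4-neighbour at the dark-peak level."""
    h, w = len(block), len(block[0])
    total = hits = 0
    for y in range(h):
        for x in range(w):
            if abs(block[y][x] - mid) > tol:
                continue
            total += 1
            neighbours = ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
            if any(0 <= ny < h and 0 <= nx < w
                   and abs(block[ny][nx] - dark) <= tol
                   for ny, nx in neighbours):
                hits += 1
    return hits / total if total else 0.0


def block_threshold(block, min_frac=0.02, high_freq=0.1):
    """Pick a threshold for one block from its histogram peaks."""
    n = sum(len(row) for row in block)
    hist = histogram(block)
    min_count = max(1, int(min_frac * n))
    dark = region_peak(hist, 0, 84, min_count)      # first region (assumed bounds)
    mid = region_peak(hist, 85, 169, min_count)     # middle region
    light = region_peak(hist, 170, 255, min_count)  # last region
    if dark is None or light is None:
        return 128                                  # assumed fallback
    if mid is None:
        return (dark + light) // 2
    if adjacency_frequency(block, mid, dark) >= high_freq:
        return (dark + mid) // 2    # middle peak treated as background
    return (mid + light) // 2       # assumption: middle peak treated as text


def separate_text(image, block_size=8):
    """Threshold each block separately; text becomes 0, background 255."""
    h, w = len(image), len(image[0])
    out = [[255] * w for _ in range(h)]
    for by in range(0, h, block_size):
        for bx in range(0, w, block_size):
            block = [row[bx:bx + block_size]
                     for row in image[by:by + block_size]]
            t = block_threshold(block)
            for dy, row in enumerate(block):
                for dx, v in enumerate(row):
                    out[by + dy][bx + dx] = 0 if v < t else 255
    return out
```

For a block in which a dark stroke sits directly on a mid-gray field, the gray pixels frequently border dark pixels, so this sketch places the threshold between the dark and gray peaks and only the stroke survives as text. When the gray pixels never touch dark pixels, the sketch falls back to its assumed alternative and places the threshold between the gray and light peaks.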

Bibliographic data

  • Publication number: US5280367A

    Patent type

  • Publication date: 1994-01-18

    Original format: PDF

  • Applicant/patentee: HEWLETT-PACKARD COMPANY

    Application number: US19910705838

  • Inventor: OSCAR A. ZUNIGA

    Filing date: 1991-05-28

  • Classification: H04N1/38; H04N1/40; H04N1/415; G06K9/34

  • Country: US

  • Indexed: 2022-08-22 04:32:24
