
Text Extraction from Historical Document Images by the Combination of Several Thresholding Techniques



Abstract

This paper presents a new technique for binarizing historical document images whose deterioration and damage make automatic processing difficult at several levels. The proposed method is a hybrid thresholding scheme: it combines the advantages of global and local methods and mixes several binarization techniques. It has two stages. In the first stage, global thresholding is applied to the entire image and two different thresholds are determined, from which most image pixels are classified as foreground or background. In the second stage, the remaining pixels are assigned to the foreground or background class based on local analysis: several local thresholding methods are combined, and the final binary value of each remaining pixel is chosen as the most probable one. The technique was tested on a large collection of standard and synthetic documents, compared with well-known methods using standard measures, and shown to perform better than them under those measures.
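The two-stage idea described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not say how the two global thresholds are computed, which local methods are combined, or how "most probable" is decided. Here the thresholds `t_low` and `t_high` are simply passed in, the local methods are assumed to be a local mean, a Niblack-style and a Sauvola-style threshold, and "most probable" is assumed to mean a majority vote. All function names and parameter values are hypothetical.

```python
from statistics import mean, pstdev

def window(img, r, c, radius):
    """Grey values in a square neighbourhood, clipped at the image border."""
    rows, cols = len(img), len(img[0])
    return [img[i][j]
            for i in range(max(0, r - radius), min(rows, r + radius + 1))
            for j in range(max(0, c - radius), min(cols, c + radius + 1))]

# Each local method returns True when it considers the pixel foreground (dark text).
def local_mean_vote(vals, pixel):
    return pixel < mean(vals)

def niblack_vote(vals, pixel, k=-0.2):
    # Niblack-style threshold: T = m + k * s (k value is an assumption)
    return pixel < mean(vals) + k * pstdev(vals)

def sauvola_vote(vals, pixel, k=0.5, R=128.0):
    # Sauvola-style threshold: T = m * (1 + k * (s / R - 1)) (k, R are assumptions)
    m, s = mean(vals), pstdev(vals)
    return pixel < m * (1 + k * (s / R - 1))

def hybrid_binarize(img, t_low, t_high, radius=1):
    """Return a binary image: 1 = foreground (text), 0 = background.

    Stage 1: pixels at or below t_low are foreground, at or above t_high
    are background. Stage 2: the remaining ambiguous pixels are decided
    by a majority vote of the three local methods (assumed combination rule).
    """
    out = []
    for r, row in enumerate(img):
        out_row = []
        for c, p in enumerate(row):
            if p <= t_low:
                out_row.append(1)
            elif p >= t_high:
                out_row.append(0)
            else:
                vals = window(img, r, c, radius)
                votes = [local_mean_vote(vals, p),
                         niblack_vote(vals, p),
                         sauvola_vote(vals, p)]
                out_row.append(1 if sum(votes) >= 2 else 0)
        out.append(out_row)
    return out

# Toy 3x3 greyscale image: one clearly dark pixel, one ambiguous mid-grey pixel.
img = [[200, 200, 200],
       [200, 120, 200],
       [30, 200, 220]]
binary = hybrid_binarize(img, t_low=50, t_high=180)
```

In this toy input only the centre pixel (120) reaches the second stage; every other pixel is settled by the two global thresholds, which is the property the abstract attributes to the first stage.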
