Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence

Document image binarization based on texture features



Abstract

Binarization is difficult for document images with poor contrast, strong noise, complex patterns, and/or gray-scale histograms of variable modality. We developed a texture-feature-based thresholding algorithm to address this problem. The algorithm consists of three steps: 1) candidate thresholds are produced by applying Otsu's algorithm (1978) iteratively; 2) texture features associated with each candidate threshold are extracted from the run-length histogram of the image binarized at that threshold; 3) the optimal threshold is selected so that desirable document texture features are preserved. In experiments with 9,000 machine-printed address blocks from an unconstrained US mail stream, the new thresholding method successfully binarized over 99.6 percent of the images, an appreciably better result than typical existing thresholding techniques obtained. In addition, a system run on 500 troublesome mail address blocks showed that our algorithm achieved an 8.1 percent higher character recognition rate than Otsu's algorithm.
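The three steps in the abstract can be illustrated with a rough Python sketch. It is not the authors' implementation, because the abstract does not give the details. Three parts are assumptions made for illustration: the iteration re-applies Otsu's algorithm to the dark part of the histogram (levels at or below the previous threshold), only horizontal runs of dark pixels are counted, and the scoring function `stroke_score` with its `min_width` and `max_width` parameters is a hypothetical stand-in for the paper's texture features. All function names are invented here.

```python
import numpy as np


def otsu_threshold(hist):
    """Return the level t that maximizes between-class variance for the
    classes [0..t] and [t+1..], or None if the histogram cannot be split."""
    hist = hist.astype(np.float64)
    total = hist.sum()
    levels = np.arange(hist.size)
    w0 = np.cumsum(hist)               # pixel count at or below each level
    w1 = total - w0                    # pixel count above each level
    m0 = np.cumsum(hist * levels)      # cumulative first moment
    valid = (w0 > 0) & (w1 > 0)
    if not valid.any():
        return None
    between = np.zeros(hist.size)
    # Proportional to the between-class variance, so the argmax is the same.
    between[valid] = (m0[-1] * w0[valid] - m0[valid] * total) ** 2 / (
        w0[valid] * w1[valid]
    )
    return int(np.argmax(between))


def candidate_thresholds(gray, max_candidates=3):
    """Step 1 (assumed form): apply Otsu repeatedly to the dark part."""
    hist = np.bincount(gray.ravel(), minlength=256)
    candidates = []
    upper = 256
    for _ in range(max_candidates):
        t = otsu_threshold(hist[:upper])
        if t is None:
            break
        candidates.append(t)
        upper = t + 1                  # next pass sees only levels <= t
    return candidates


def run_length_histogram(binary, max_run=32):
    """Step 2: histogram of horizontal foreground run lengths.
    Runs longer than max_run are counted in the last bin."""
    hist = np.zeros(max_run + 1, dtype=np.int64)
    for row in binary:
        padded = np.concatenate(([0], row.astype(np.int8), [0]))
        diff = np.diff(padded)
        starts = np.flatnonzero(diff == 1)
        ends = np.flatnonzero(diff == -1)
        lengths = np.minimum(ends - starts, max_run)
        np.add.at(hist, lengths, 1)
    return hist


def stroke_score(run_hist, min_width=2, max_width=8):
    """Hypothetical texture score: the fraction of runs whose length falls
    in an assumed stroke-width range. Not the paper's feature set."""
    total = run_hist.sum()
    if total == 0:
        return 0.0
    return float(run_hist[min_width:max_width + 1].sum() / total)


def select_threshold(gray):
    """Step 3: keep the candidate whose binarization scores highest."""
    best = None
    for t in candidate_thresholds(gray):
        binary = gray <= t             # dark pixels are foreground
        score = stroke_score(run_length_histogram(binary))
        if best is None or score > best[1]:
            best = (t, score)
    return best
```

Calling `select_threshold` on an 8-bit gray-scale array returns a `(threshold, score)` pair, or `None` when no candidate exists. The stroke-width range would need tuning to the scan resolution and font sizes of the documents at hand.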
