Conference: Visual Information Processing XIV

Optimizing OCR accuracy for bi-tonal, noisy scans of degraded Arabic documents


Abstract

Acquiring foreign language from degraded hardcopy documents is of interest to military and border control applications. Bi-tonal image scans are desirable because file size is small. However, the nature of hardcopy degradations and the scanner or image enhancement software capabilities used directly affect the quality of the captured image and the extent of language acquisition. We applied a collection of manual treatments to hardcopy Arabic documents to develop a corpus of bi-tonal images. We then used this corpus in an exploratory study to derive conclusions about how bi-tonal images could be enhanced. This paper discusses the manually degraded Arabic document corpus, the image enhancement study, and the significant optical character recognition (OCR) improvements obtained with simple scanner driver adjustments.
