Source: Conference on Visual Information Processing
Optimizing OCR accuracy for bi-tonal, noisy scans of degraded Arabic documents

Abstract

Acquiring foreign-language content from degraded hardcopy documents is of interest for military and border control applications. Bi-tonal image scans are desirable because their file sizes are small. However, the nature of the hardcopy degradation and the capabilities of the scanner or image enhancement software directly affect the quality of the captured image and the extent of language acquisition. We applied a collection of manual treatments to hardcopy Arabic documents to develop a corpus of bi-tonal images. We then used this corpus in an exploratory study to determine how bi-tonal images could be enhanced. This paper discusses the manually degraded Arabic document corpus, the image enhancement study, and the significant optical character recognition (OCR) improvements obtained with simple scanner driver adjustments.
