
Beyond OCRs for Document Blur Estimation



Abstract

The current document blur/quality estimation algorithms rely on the OCR accuracy to measure their success. A sharp document image, however, at times may yield lower OCR accuracy owing to factors independent of blur or quality of capture. The necessity to rely on OCR is mainly due to the difficulty in quantifying the quality otherwise. In this work, we overcome this limitation by proposing a novel dataset for document blur estimation, for which we physically quantify the blur using a capture set-up which computationally varies the focal distance of the camera. We also present a selective search mechanism to improve upon the recently successful patch-based learning approaches (using codebooks or convolutional neural networks). We present a thorough analysis of the improved blur estimation pipeline using correlation with OCR accuracy as well as the actual amount of blur. Our experiments demonstrate that our method outperforms the current state-of-the-art by a significant margin.
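The evaluation described above, correlating predicted blur with both OCR accuracy and the physically measured blur, can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the abstract does not say which correlation coefficient is used, so Spearman rank correlation is an assumption here, and the variable names and toy values (`predicted_blur`, `focal_offset`, `ocr_accuracy`) are made up for the example rather than taken from the paper.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson linear correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Toy values for illustration only (hypothetical, not from the paper).
predicted_blur = [0.10, 0.40, 0.35, 0.80, 0.90]  # estimator output per image
focal_offset = [0, 1, 2, 3, 4]                   # physically measured defocus step
ocr_accuracy = [0.98, 0.95, 0.90, 0.60, 0.20]    # OCR accuracy per image

rho_blur = spearman(predicted_blur, focal_offset)
rho_ocr = spearman(predicted_blur, ocr_accuracy)
```

A good estimator should give a strongly positive `rho_blur` and a strongly negative `rho_ocr`. When the two disagree for an image, that points to the abstract's motivating case: OCR accuracy dropping for reasons unrelated to blur.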
