Beyond OCRs for Document Blur Estimation

Abstract

Current document blur/quality estimation algorithms rely on OCR accuracy to measure their success. A sharp document image, however, may at times yield lower OCR accuracy owing to factors independent of blur or capture quality. The reliance on OCR stems mainly from the difficulty of quantifying quality in any other way. In this work, we overcome this limitation by proposing a novel dataset for document blur estimation, in which we physically quantify the blur using a capture set-up that computationally varies the focal distance of the camera. We also present a selective search mechanism to improve upon the recently successful patch-based learning approaches (using codebooks or convolutional neural networks). We present a thorough analysis of the improved blur estimation pipeline using its correlation with OCR accuracy as well as with the actual amount of blur. Our experiments demonstrate that our method outperforms the current state of the art by a significant margin.
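
The evaluation idea in the abstract, scoring a blur estimator by how well it correlates with both OCR accuracy and the physically measured blur, can be illustrated with a small sketch. The abstract does not say which correlation measure is used, so Spearman rank correlation here is an assumption, and the names `predicted_blur`, `focal_offset`, and `ocr_accuracy`, along with all the numbers, are hypothetical toy values rather than data from the paper.

```python
def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical toy values for five captures of one page (not from the paper).
predicted_blur = [0.10, 0.40, 0.35, 0.80, 0.90]  # estimator output
focal_offset = [0, 2, 1, 3, 4]                   # physically measured blur level
ocr_accuracy = [0.98, 0.90, 0.95, 0.60, 0.70]    # OCR character accuracy

rho_physical = spearman(predicted_blur, focal_offset)
rho_ocr = spearman(predicted_blur, ocr_accuracy)
print(rho_physical, rho_ocr)
```

In this toy data the estimator ranks the captures exactly as the focal offsets do, while its agreement with OCR accuracy is strong but imperfect. That gap is the kind of discrepancy the abstract attributes to OCR errors that are independent of blur.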
