Conference: IAPR International Workshop on Document Analysis Systems

OCR Performance Prediction Using a Bag of Allographs and Support Vector Regression

Abstract

In this paper, we describe a novel and simple technique for predicting OCR results without running any OCR. The technique uses a bag of allographs to characterize textual components. A support vector regression (SVR) technique is then used to build a predictor based on the bag of allographs. The performance of the system is evaluated on a corpus of historical documents. The proposed technique predicts OCR results on training and test documents within a standard deviation of 4.18% and 6.54%, respectively. The proposed system is designed as a tool to assist in selecting corpora in libraries and to specify the typical performance that can be expected on the selection.
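
The abstract does not say how the bag of allographs is computed, so the following is only a minimal sketch of one plausible reading: each textual component is assigned to its nearest entry in a codebook of allograph prototypes, and a document is summarized by the normalized histogram of those assignments. The 2-D feature vectors, the codebook values, and the function names `nearest_allograph` and `bag_of_allographs` are all hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def nearest_allograph(feature: Vector, codebook: List[Vector]) -> int:
    """Return the index of the codebook entry closest to `feature` (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda k: math.dist(feature, codebook[k]))


def bag_of_allographs(components: List[Vector], codebook: List[Vector]) -> List[float]:
    """Build a normalized histogram of allograph assignments for one document."""
    counts = [0] * len(codebook)
    for feature in components:
        counts[nearest_allograph(feature, codebook)] += 1
    total = sum(counts)
    # An empty document yields an all-zero histogram instead of dividing by zero.
    return [c / total for c in counts] if total else [0.0] * len(codebook)


# Hypothetical codebook of three allograph prototypes (assumed to come from
# clustering component descriptors; the abstract does not specify this step).
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]

# Hypothetical descriptors of four textual components on one page.
page = [(0.1, -0.1), (0.9, 1.2), (1.1, 0.8), (4.8, 5.1)]

histogram = bag_of_allographs(page, codebook)
print(histogram)
```

Under this reading, one such histogram per document would serve as the input vector to a regressor trained against measured OCR accuracy, for example scikit-learn's `sklearn.svm.SVR`. The choice of library and kernel is an assumption here, not something the abstract states.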