Conference: 12th International Conference on Frontiers in Handwriting Recognition

User-Defined Expected Error Rate in OCR Postprocessing by Means of Automatic Threshold Estimation



Abstract

In this work, a method is presented for automatically estimating a threshold that allows the user of an OCR system to define an expected error rate. When the OCR output is post-processed using a language model, a probability or reliability index (or a "transformation cost") is usually obtained, reflecting the likelihood (or its inverse) that the string of OCR hypotheses belongs to the model. By applying a threshold to this index (or cost) to reject the less reliable hypotheses, a variable level of expected accuracy can be imposed on the output. It is much more convenient for the user to be able to "fix" the expected error rate at an acceptable level than to deal with an arbitrary threshold. Of course, the result will always be high reject rates for difficult tasks and lower reject rates for easier tasks.
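The abstract does not spell out how the threshold is estimated, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes a labeled validation set of (cost, is_correct) pairs is available, that lower cost means a more reliable hypothesis, and that the threshold is chosen as the largest cost at which the error rate among accepted hypotheses stays within the user's target. The names `estimate_threshold`, `reject_rate`, and the sample data are hypothetical.

```python
from typing import List, Tuple

Sample = Tuple[float, bool]  # (transformation cost, hypothesis was correct)


def estimate_threshold(samples: List[Sample], target_error: float) -> float:
    """Return the largest cost threshold whose accepted set meets target_error.

    Hypotheses with cost <= threshold are accepted, the rest are rejected.
    Returns -inf when no threshold meets the target (everything is rejected).
    This selection rule is an assumption made for illustration.
    """
    ordered = sorted(samples, key=lambda s: s[0])
    best = float("-inf")
    errors = 0
    for i, (cost, is_correct) in enumerate(ordered):
        if not is_correct:
            errors += 1
        # Evaluate only at the last sample of a run of equal costs,
        # because a threshold cannot separate hypotheses with the same cost.
        if i + 1 < len(ordered) and ordered[i + 1][0] == cost:
            continue
        # Error rate among the i + 1 hypotheses accepted at this threshold.
        if errors / (i + 1) <= target_error:
            best = cost
    return best


def reject_rate(samples: List[Sample], threshold: float) -> float:
    """Fraction of hypotheses whose cost exceeds the threshold."""
    rejected = sum(1 for cost, _ in samples if cost > threshold)
    return rejected / len(samples)


# Hypothetical validation data: (cost, is_correct).
validation = [
    (0.1, True), (0.2, True), (0.3, False), (0.4, True),
    (0.5, True), (0.9, False), (1.2, False), (1.5, True),
]

for target in (0.0, 0.25, 0.5):
    t = estimate_threshold(validation, target)
    print(target, t, reject_rate(validation, t))
```

On this toy data, a stricter target error rate yields a lower threshold and a higher reject rate, which matches the trade-off the abstract describes: demanding fewer errors means rejecting more hypotheses.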
