Method and means for improving optical character recognition (OCR) of printed documents

Abstract

Document markers are printed on the document surface as machine-readable symbolic representations. They contain first values that depend on the layout and content of the document and are assigned by the creation or processing software. The markers carry encoded document placement information and values assigned over a sequence of the original text; these values include decimation sequences, error correction codes, or checksums that depend on the text. When the document is scanned with optical character recognition, or reproduced in some other digitized form, the markers are scanned as well. The scanning computer runs corresponding software and assigns second values that depend on the arrangement and content of the reproduced document. Comparing the first and second decimation sequences detects line and character errors and corrects some of them, producing rearranged sequences. An optional error correction code may provide further correction when applied to the rearranged sequences of the reproduced document, and an optional checksum comparison may be used to verify the accuracy of the reproduced sequences.
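The comparison of first and second values can be illustrated with a minimal Python sketch. It is not the patented method: it assumes one checksum per text line, uses CRC-32 from `zlib` as a stand-in for the unspecified checksum, and uses `difflib.SequenceMatcher` as a stand-in for the sequence comparison. The helper names `line_values` and `compare_values` are hypothetical.

```python
import difflib
import zlib


def line_values(lines):
    """Compute one value per text line.

    Assumption: CRC-32 stands in for whatever checksum the marker encodes.
    """
    return [zlib.crc32(line.encode("utf-8")) for line in lines]


def compare_values(first_values, second_values):
    """Align the two value sequences and return only the spans that differ.

    Each result is (tag, i1, i2, j1, j2), where tag is 'replace', 'delete',
    or 'insert', i1:i2 indexes the original lines, and j1:j2 indexes the
    reproduced lines.
    """
    matcher = difflib.SequenceMatcher(
        None, first_values, second_values, autojunk=False
    )
    return [op for op in matcher.get_opcodes() if op[0] != "equal"]


# First values: computed when the document is created and, in the scheme
# described above, encoded in the printed marker.
original = ["Pay 100 dollars", "to the bearer", "on demand"]
first_values = line_values(original)

# Second values: recomputed from the OCR output (here "00" was read as "OO").
scanned = ["Pay 1OO dollars", "to the bearer", "on demand"]
second_values = line_values(scanned)

errors = compare_values(first_values, second_values)
for tag, i1, i2, j1, j2 in errors:
    print(f"{tag}: original lines {i1}:{i2}, reproduced lines {j1}:{j2}")
```

In this sketch a misread character shows up as a `replace` span on the affected line, and a dropped line shows up as a `delete` span. Locating the error this way only narrows where correction is needed; the correction itself would rely on the optional error correction code mentioned in the abstract.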