Improving Optical Character Recognition through Efficient Multiple System Alignment


Abstract

Individual optical character recognition (OCR) engines vary in the types of errors they commit in recognizing text, particularly poor quality text. By aligning the output of multiple OCR engines and taking advantage of the differences between them, the error rate based on the aligned lattice of recognized words is significantly lower than the individual OCR word error rates. This lattice error rate constitutes a lower bound among aligned alternatives from the OCR output. Results from a collection of poor quality mid-twentieth century typewritten documents demonstrate an average reduction of 55.0% in the error rate of the lattice of alternatives and a realized word error rate (WER) reduction of 35.8% in a dictionary-based selection process. As an important precursor, an innovative admissible heuristic for the A* algorithm is developed, which results in a significant reduction in state space exploration to identify all optimal alignments of the OCR text output, a necessary step toward the construction of the word hypothesis lattice. On average 0.0079% of the state space is explored to identify all optimal alignments of the documents.
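The pipeline the abstract describes (align the OCR outputs with A*, collect the aligned alternatives, then select words with a dictionary) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: it aligns only two word sequences, uses unit edit costs, returns a single optimal alignment rather than all of them, and uses a simple length-difference heuristic that is admissible but is not the heuristic developed in the paper. The names `astar_align` and `select_words` are hypothetical.

```python
import heapq

GAP = None  # marks "no word from this engine" in an alignment column


def astar_align(a, b):
    """Align word lists a and b with A*; return (cost, columns, states_expanded).

    Illustrative assumptions: unit cost for a substitution or a gap, and a
    heuristic equal to the difference in remaining lengths. Each unit of that
    difference forces at least one gap, so the heuristic never overestimates.
    """
    n, m = len(a), len(b)

    def h(i, j):
        return abs((n - i) - (m - j))

    start, goal = (0, 0), (n, m)
    best = {start: 0}          # cheapest known cost to reach each state
    parent = {}                # back-pointers for recovering the alignment
    frontier = [(h(0, 0), 0, start)]
    expanded = set()
    while frontier:
        _, g, (i, j) = heapq.heappop(frontier)
        if (i, j) in expanded:
            continue           # stale queue entry
        expanded.add((i, j))
        if (i, j) == goal:
            break
        moves = []
        if i < n and j < m:    # pair a[i] with b[j]
            moves.append((i + 1, j + 1, 0 if a[i] == b[j] else 1))
        if i < n:              # a[i] has no counterpart in b
            moves.append((i + 1, j, 1))
        if j < m:              # b[j] has no counterpart in a
            moves.append((i, j + 1, 1))
        for ni, nj, step in moves:
            ng = g + step
            if ng < best.get((ni, nj), float("inf")):
                best[(ni, nj)] = ng
                parent[(ni, nj)] = (i, j)
                heapq.heappush(frontier, (ng + h(ni, nj), ng, (ni, nj)))

    # Walk the back-pointers from the goal to rebuild the aligned columns.
    columns = []
    node = goal
    while node != start:
        pi, pj = parent[node]
        i, j = node
        columns.append((a[pi] if i > pi else GAP, b[pj] if j > pj else GAP))
        node = (pi, pj)
    columns.reverse()
    return best[goal], columns, len(expanded)


def select_words(columns, dictionary):
    """Pick one word per column, preferring alternatives found in the dictionary."""
    chosen = []
    for column in columns:
        candidates = [w for w in column if w is not GAP]
        in_dictionary = [w for w in candidates if w.lower() in dictionary]
        chosen.append((in_dictionary or candidates)[0])
    return chosen


if __name__ == "__main__":
    engine_1 = "the qu1ck brown fox jumps".split()
    engine_2 = "the quick brovvn fox jumps".split()
    cost, columns, explored = astar_align(engine_1, engine_2)
    words = select_words(columns, {"the", "quick", "brown", "fox", "jumps"})
    print(cost, words, explored, "of", (len(engine_1) + 1) * (len(engine_2) + 1))
```

The list of columns plays the role of a very small word lattice: each column holds the alternatives the engines proposed at one position. The count of expanded states is returned only to show how the explored fraction of the state space could be measured; it says nothing about the figures reported in the abstract.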


Bibliographic details

Source
9th ACM/IEEE joint conference on digital libraries 2009 | 2009 | pp. 231-240 | 10 pages
Conference location: Austin, TX (US)
Authors
William B. Lund; Eric K. Ringger
Author affiliations

Harold B. Lee Library and the Department of Computer Science Brigham Young University 2060 Lee Library Provo, Utah 84602, USA;

Department of Computer Science Brigham Young University 3368 Talmage Building Provo, Utah 84602, USA;

Conference organizer
Original format: PDF
Text language: eng
Chinese Library Classification: electronic libraries, digital libraries
Keywords
A* algorithm; text alignment; OCR error rate reduction

Date added to database: 2022-08-26 14:06:11

Similar literature


1. Efficient mobile phone Chinese optical character recognition systems by use of heuristic fuzzy rules and bigram Markov language models [J]. Adrian David Cheok, Zhang Jian, Eng Siong Chng. Applied Soft Computing. 2008, No. 2
2. Technical Methods and Algorithms for Developing Efficient Optical Character Recognition System: An Overview [J]. Yakubu A. Ibrahim, Tunji S. Ibiyemi. Annals. Computer Science Series. 2018, No. 2
3. Simple and Efficient Method for Region of Interest Value Extraction from Picture Archiving and Communication System Viewer with Optical Character Recognition Software and Macro Program [J]. Lee Young Han, Park Eun Hae, Suh Jin-Suck. Academic radiology. 2015, No. 1
4. Improving optical character recognition through efficient multiple system alignment [C]. William B. Lund, Eric K. Ringger. ACM/IEEE-CS joint conference on Digital libraries. 2009
5. A multimodal fusion approach for automatic postal address recognition system using Optical Character Recognition (OCR) and Automatic Speech Recognition (ASR) techniques [D]. Singh, Amriteshwar. 2011
6. A Real-Time Automatic Plate Recognition System Based on Optical Character Recognition and Wireless Sensor Networks for ITS [O]. Nicole do Vale Dalarmelina, Marcio Andrey Teixeira, Rodolfo I. Meneguette. 2020
7. An Efficient FPGA Implementation of Optical Character Recognition System for License Plate Recognition [O]. Jing Yuan. 2016
8. Optical Character Recognition for Automated Cartography: The Advanced Development Handprinted Symbol Recognition System [R]. Brown, R. M., Cheng, C. F. 1983
