Public domain optical character recognition


Abstract: A public domain document processing system has been developed by the National Institute of Standards and Technology (NIST). The system is a standard reference form-based handprint recognition system for evaluating optical character recognition (OCR), and it is intended to provide a baseline of performance on an open application. The system's source code, training data, performance assessment tools, and type of forms processed are all publicly available. The system recognizes the handprint entered on handwriting sample forms like the ones distributed with NIST Special Database 1. From these forms, the system reads hand-printed numeric fields, uppercase and lowercase alphabetic fields, and unconstrained text paragraphs composed of words from a limited-size dictionary. The modular design of the system makes it useful for component evaluation and comparison, training and testing set validation, and multiple system voting schemes. The system contains a number of significant contributions to OCR technology, including an optimized probabilistic neural network (PNN) classifier that runs 20 times faster than traditional software implementations of the algorithm. The source code for the recognition system is written in C and is organized into 11 libraries. In all, there are approximately 19,000 lines of code supporting more than 550 subroutines. Source code is provided for form registration, form removal, field isolation, field segmentation, character normalization, feature extraction, character classification, and dictionary-based postprocessing. The recognition system has been successfully compiled and tested on a host of UNIX workstations. This paper gives an overview of the recognition system's software architecture, including descriptions of the various system components along with timing and accuracy statistics.


Bibliographic record

Source: Document Recognition II, 1995, pp. 2-14 (13 pages)
Authors: Michael D. Garris; James L. Blue; Gerald T. Candela; Darrin L. Dimmick; Jon C. Geist; Patrick J. Grother; Stanley A. Janet; Charles L. Wilson
Author affiliation: National Institute of Standards and Technology, Gaithersburg, MD, USA
Original format: PDF
Chinese Library Classification: Automation technology, computer technology

Similar documents

1. GENETIC ALGORITHM AND NEURAL NETWORK FOR OPTICAL CHARACTER RECOGNITION | Science Publications [J]. Hendy Yeremia, Niko Adrianus Yuwono, Pius Raymond. Journal of computer sciences, 2013, No. 11.
2. Effect of Hidden Layer Neurons on the Classification of Optical Character Recognition Typed Arabic Numerals | Science Publications [J]. Mahmoud Z. Iskandarani, Nidal F. Shilbayeh. Journal of computer sciences, 2008, No. 7.
3. Optical Character Recognition System for Arabic Text Using Cursive Multi-Directional Approach | Science Publications [J]. Jamil Ahmad, Mansoor Al-Aali. Journal of computer sciences, 2007, No. 7.
4. Public domain optical character recognition [C]. Michael D. Garris, James L. Blue, Gerald T. Candela. Conference on Document Recognition, 1995.
5. A multimodal fusion approach for automatic postal address recognition system using Optical Character Recognition (OCR) and Automatic Speech Recognition (ASR) techniques [D]. Singh, Amriteshwar. 2011.
6. A Real-Time Automatic Plate Recognition System Based on Optical Character Recognition and Wireless Sensor Networks for ITS [O]. Nicole do Vale Dalarmelina, Marcio Andrey Teixeira, Rodolfo I. Meneguette. 2020.
7. Form Recognition dan Character Mapping Menggunakan Image Segmentation dan Optical Character Recognition [O]. Christian Wibisono, Setia Budi. 2021.
8. Optical Character Recognition (OCR) Inks. Category: Hardware Standard. Subcategory: Character Recognition [R]. Owen, R. K. 1980.
