Autonomously normalized horizontal differentials as features for HMM-based Omni font-written OCR systems for cursively scripted languages



Abstract

Automatic font-written Optical Character Recognition (OCR) is highly desirable for numerous modern information technology (IT) applications. Reliable font-written OCRs for Latin scripts have long been in use. For cursively scripted languages, which are the mother tongues of over one fourth of the world's population, such OCRs are however not yet available with robust and reliable performance. The main challenge is the mandatory connectivity of characters/ligatures (i.e. graphemes), which has to be resolved simultaneously with the recognition of these graphemes. Among the various approaches tried over decades, Hidden Markov Model (HMM)-based OCRs seem to be the most promising, as they capitalize on the ability of HMM decoders to achieve segmentation and recognition simultaneously, similar to the widely used HMM-based automatic speech recognition (ASR). In contrast to ASR, HMM-based OCRs still lack a rigorously founded features vector capable of robustly achieving minimal "font type/size-independent" (omnifont) word error rates comparable to those realized with Latin scripts. This paper contributes such a features vector design and experimentally shows its superiority in this regard.


Bibliographic details

Source
2009 IEEE International Conference on Signal and Image Processing Applications | 2009 | pp. 185-190 | 6 pages
Conference location: Kuala Lumpur (MY)
Authors
Attia Mohamed; Rashwan Mohsen A. A.; El-Mahallawy Mohamed S. M.;
Author affiliations

The Engineering Company for the Development of Computer Systems;

RDI, Egypt;

Original format: PDF
Language: English
CLC classification: Information processing

Similar literature


1. Using a Statistical Language Model to Improve the Performance of an HMM-Based Cursive Handwriting Recognition System [J] . U.-V. Marti, H. Bunke International Journal of Pattern Recognition and Artificial Intelligence . 2001, No. 1

2. Arabic OCR System Analogous to HMM-Based ASR Systems; Implementation and Evaluation [J] . M. A. Rashwan, M. W. Fakhr, M. Attia Journal of Engineering and Applied Science . 2007, No. 6

3. ASCII Based GUI System for Arabic Scripted Languages: A Case of Urdu [J] . Rehman Bacha, Halim Zahid, Ahmad Mustaq The International Arab Journal of Information Technology . 2014, No. 4

4. Autonomously normalized horizontal differentials as features for HMM-based Omni font-written OCR systems for cursively scripted languages [C] . Institute of Electrical and Electronics Engineers International Conference on Signal and Image Processing Applications . 2009

5. Design and development of an autonomous navigation system for an omni-directional four-wheeled mobile robot. [D] . Ginzburg, Sasha. 2012

6. Unified Medical Language System resources improve sieve-based generation and Bidirectional Encoder Representations from Transformers (BERT)–based ranking for concept normalization [O] . Dongfang Xu, Manoj Gopale, Jiacheng Zhang, 2020

7. A Front-End OCR for Omni-Font Persian/Arabic Cursive Printed Documents [O] . Ramin Mehran, Hamed Pirsiavash, Farbod Razzazi 2005

8. Considerations on Command and Response Language Features for a Network of Heterogeneous Autonomous Computers [R] . Engelberg, N., Shaw, Iii, C. 1984

