Improving post-processing optical character recognition documents with Arabic language using spelling error detection and correction

Iyad Abu Doush; Ahmed M. Al-Trad

Abstract

Optical character recognition (OCR) converts scanned documents into text, and the resulting text needs to be validated for correctness. The problem is harder for Arabic text because of the complexity of the Arabic language. This research explores ways of improving OCR spell-checking effectiveness by proposing a post-processing Arabic OCR system based on three different approaches: Microsoft Office Word with the Google online suggestion system, the Ayaspell spell checker with the Google online suggestion system, and the Google online suggestion system alone. We used precision and recall to evaluate the effectiveness of the proposed OCR post-processing. The results show that using Microsoft Office Word with Google outperforms the other approaches, with an accuracy of 0.49.
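The abstract names precision and recall as the evaluation measures but does not define them. As a rough illustration only, the sketch below computes both for the error-detection step, assuming the OCR output and the ground truth are already aligned token by token. The function name `detection_scores`, the alignment assumption, and the Latin placeholder tokens are all hypothetical and are not taken from the paper.

```python
def detection_scores(ocr_tokens, truth_tokens, flagged):
    """Precision and recall for spelling-error detection (illustrative sketch).

    ocr_tokens and truth_tokens are assumed to be aligned lists of equal
    length; flagged is the set of token indices the checker marked as errors.
    """
    # A token counts as a real error when it differs from the ground truth.
    actual_errors = {
        i for i, (ocr, truth) in enumerate(zip(ocr_tokens, truth_tokens))
        if ocr != truth
    }
    true_positives = len(actual_errors & flagged)
    # Guard against division by zero when nothing is flagged or nothing is wrong.
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(actual_errors) if actual_errors else 0.0
    return precision, recall


# Placeholder tokens stand in for Arabic words to keep the example readable.
ocr = ["the", "qnick", "brown", "fax"]
truth = ["the", "quick", "brown", "fox"]
precision, recall = detection_scores(ocr, truth, flagged={1, 2})
print(precision, recall)
```

Here the checker flags one real error and one correct word, and misses one real error, so both scores are 0.5. How the paper derives its reported accuracy figure from these measures is not stated in the abstract.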

Bibliographic details

Source
International journal of reasoning-based intelligent systems | 2016, Issue 4 | pp. 91-103 | 13 pages
Authors
Iyad Abu Doush; Ahmed M. Al-Trad
Author affiliations

Computer Science Department, Yarmouk University, Irbid, Jordan;

Computer Science Department, Yarmouk University, Irbid, Jordan;

Original format: PDF
Language: English
Keywords
post-processing Arabic OCR; Arabic optical character recognition; Arabic spell checker;

Similar literature

1. Spelling correction for the Arabic language space deletion errors [J]. Yousfi Abdellah, Aouragh Si Lhoussain, Gueddah Hicham. Procedia Computer Science, 2020, Issue 5.
2. Arabic spelling error detection and correction [J]. Mohammed Attia, Pavel Pecina, Younes Samih. Natural language engineering, 2016, Issue pta5.
3. An Interactive Open-vocabulary Chinese Name Input System Using Syllable Spelling And Character Description Recognition Modules For Error Correction [J]. Nick Jui Chang Wang. IEICE Transactions on Information and Systems, 2007, Issue 11.
4. QCMUQ@QALB-2015 Shared Task: Combining Character level MT and Error-tolerant Finite-State Recognition for Arabic Spelling Correction [C]. Houda Bouamor, Hassan Sajjad, Nadir Durrani. Workshop on Arabic natural language processing, 2015.
5. Optical Character Recognition of Printed Persian/Arabic Documents [D]. Shafii, Mahnaz. 2014.
6. Synthesis of Common Arabic Handwritings to Aid Optical Character Recognition Research [O]. Laslo Dinges, Ayoub Al-Hamadi, Moftah Elzobi. 2016.
7. QCMUQ@QALB-2015 Shared Task: Combining Character level MT and Error-tolerant Finite-State Recognition for Arabic Spelling Correction [O]. Houda Bouamor, Hassan Sajjad, Nadir Durrani. 2015.
8. Foreign Language Optical Character Recognition, Phase II: Arabic and Persian Training and Test Data Sets [R]. Davidson, R. B., Hopely, R. L. 1997.
