OCR-independent and segmentation-free word-spotting in handwritten Arabic Archive documents

Abstract

This paper presents a word-spotting approach that can help in reading handwritten Arabic archive documents. Because of the low quality of these documents, the proposed approach is segmentation-free and independent of OCR, and uses a global transformation of word images. It is a learning-based approach that employs the Generalized Hough Transform (GHT). It detects words, described by their models, in document images by finding each model's position in the image. With the GHT, the problem of finding the model's position is transformed into the problem of finding the parameters of the transformation that maps the model into the image. Parameters such as the Hough threshold and the distance between voting points are considered for better localization and recognition of words. We tested our system on registers from the 19th century onwards, held in the National Archives of Tunisia. Our first experiments reach an average of 94% well-spotted words.
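
The voting idea behind the GHT can be illustrated with a minimal sketch. This is not the authors' implementation: it handles translation only, omits the gradient-orientation indexing of a full R-table, and works on toy point sets instead of word images. The names `build_r_table`, `spot_word`, `min_votes` (a stand-in for the Hough threshold) and `step` (a stand-in for the distance between voting points) are hypothetical and chosen for illustration.

```python
from collections import Counter


def build_r_table(model_points):
    """Store the displacement from every model point to a reference point.

    Assumption: the reference point is the integer centroid of the model.
    """
    n = len(model_points)
    ref_x = sum(x for x, _ in model_points) // n
    ref_y = sum(y for _, y in model_points) // n
    return [(ref_x - x, ref_y - y) for x, y in model_points]


def spot_word(image_points, r_table, min_votes, step=1):
    """Return candidate reference positions that collect at least min_votes votes.

    Every `step`-th image point votes for each position where the model's
    reference point could lie if that image point belonged to the model.
    """
    accumulator = Counter()
    for x, y in sorted(image_points)[::step]:
        for dx, dy in r_table:
            accumulator[(x + dx, y + dy)] += 1
    return sorted(pos for pos, votes in accumulator.items() if votes >= min_votes)


# Toy data: an L-shaped "word model" placed at offset (10, 7), plus two noise points.
model = [(0, 0), (1, 0), (2, 0), (0, 1), (0, 2)]
offset = (10, 7)
image = {(x + offset[0], y + offset[1]) for x, y in model} | {(3, 3), (20, 1)}

r_table = build_r_table(model)
hits = spot_word(image, r_table, min_votes=len(model))
print(hits)
```

Lowering `min_votes` admits more candidate positions, and raising `step` reduces the number of voters (and therefore the achievable vote count), which is why such parameters need to be chosen together.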

Bibliographic details

Source
2013 International Conference on Electrical Engineering and Software Applications | 2013 | pp. 1-6 | 6 pages
Conference location: Hammamet (TN)
Authors
Aouadi, N.; Kacem, A.
Author affiliation

LaTICE, Research Laboratory of Technology of Information and Communication Electrical Engineering, 5, Avenue Taha Hussein, BP 56 Bab Mnara, Tunis, Tunisia
Original format: PDF
Language: English
Keywords
Clustering; Generalized Hough Transform; Handwritten Recognition; Historical document; OCR; Word-spotting;

Similar literature

1. A Novel Word-Spotting Method for Handwritten Documents Using an Optimization-Based Classifier [J] . Tavoli Reza, Keyvanpour Mohammadreza. Applied Artificial Intelligence . 2017, No. 4-6

2. Segmentation-free word spotting in historical Bangla handwritten document using Wave Kernel Signature [J] . Pattern Analysis and Applications . 2020, No. 2

3. Access By Content To Handwritten Archive Documents: Generic Document Recognition Method And Platform For Annotations [J] . Bertrand Coüasnon, Jean Camillerapp, Ivan Leplumey. International Journal on Document Analysis and Recognition . 2007, No. 2-4

4. OCR-independent and Segmentation-free Word-Spotting in Handwritten Arabic Archive Documents [C] . Aouadi N., Kacem A. International Conference on Electrical Engineering and Software Applications . 2013

5. Writer identification of Arabic handwritten documents. [D] . Awaida, Sameh Mohammad. 2011

6. Novel Deep Convolutional Neural Network-Based Contextual Recognition of Arabic Handwritten Scripts [O] . Rami Ahmed, Mandar Gogate, Ahsen Tahir, 2021

7. Segmentation-free Word Spotting for Handwritten Arabic Documents [O] . Ghizlane Khaissidi, Youssef Elfakir, Mostafa Mrabti, 2016
