Conference: 2013 International Conference on Electrical Engineering and Software Applications

OCR-independent and segmentation-free word-spotting in handwritten Arabic Archive documents


Abstract

This paper presents a word-spotting approach to help in reading handwritten Arabic archive documents. Because of the low quality of these documents, the proposed approach is segmentation-free and independent of OCR, and relies on a global transformation of word images. It is a learning-based approach that employs the Generalized Hough Transform (GHT). It detects words, each described by a model, in document images by finding the model's position in the image. With the GHT, the problem of finding the model's position becomes the problem of finding the parameters of the transformation that maps the model into the image. Parameters such as the Hough threshold and the distance between voting points are taken into account to improve the localization and recognition of words. We tested our system on registers from the 19th century onwards, held in the National Archives of Tunisia. Our first experiments reach an average of 94% correctly spotted words.
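
To make the voting idea concrete, the following is a minimal sketch of a translation-only GHT on point sets. It is not the authors' implementation. It assumes the word model and the document image have already been reduced to lists of (x, y) voting points, and it omits the gradient-orientation indexing of the full GHT. The names `build_r_table`, `spot`, `hough_threshold`, and `step` are hypothetical; `step` merely stands in for the spacing between voting points mentioned in the abstract.

```python
from collections import Counter


def build_r_table(model_points, reference):
    """Store the displacement from each model point to the reference point."""
    rx, ry = reference
    return [(rx - x, ry - y) for x, y in model_points]


def spot(image_points, r_table, hough_threshold, step=1):
    """Vote for candidate reference positions and keep those reaching the threshold.

    Assumption: `step` keeps every step-th point, a crude stand-in for
    controlling the distance between voting points.
    """
    accumulator = Counter()
    for x, y in image_points[::step]:
        for dx, dy in r_table:
            # Each image point votes for every position where the model's
            # reference point could lie if this point belonged to the model.
            accumulator[(x + dx, y + dy)] += 1
    return [(pos, votes) for pos, votes in accumulator.most_common()
            if votes >= hough_threshold]


# Toy data: an L-shaped "word model" and an image holding a shifted copy plus noise.
model = [(0, 0), (1, 0), (2, 0), (0, 1), (0, 2)]
r_table = build_r_table(model, reference=(0, 0))
image = [(x + 10, y + 5) for x, y in model] + [(3, 17), (20, 1)]

hits = spot(image, r_table, hough_threshold=len(model))
print(hits)  # [((10, 5), 5)]
```

In this toy case the only accumulator cell that collects a vote from every model point is the true translation (10, 5). Lowering `hough_threshold` would admit partial matches, which is the kind of trade-off the threshold parameter controls.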
