Conference: International Mobile, Intelligent, and Ubiquitous Computing Conference

OCFormer: A Transformer-Based Model For Arabic Handwritten Text Recognition

Abstract

The Optical Character Recognition (OCR) of Arabic historical documents is a challenging task because of the complexity of the layout and the highly variable typography. Nonetheless, in recent years, with the rise of deep learning, significant progress has been made in historical OCR, both in layout recognition and segmentation and in character recognition. The main downside is the limited progress dedicated to the Arabic language, notably handwritten text. In this paper, we present an OCR approach that utilizes state-of-the-art deep learning techniques for the Arabic language. We built a custom dataset of obfuscated and noisy images to imitate the noise in historical Arabic documents, with a collection of 30 million images paired with their ground truth. The model utilizes both page segmentation and line segmentation techniques to enhance the resultant transcription. The model is complex enough for transcribing handwritten manuscripts. In addition, the model can detect and transcribe documents that contain Arabic diacritics. The model attained a character error rate (CER) of 0.0727, a word error rate (WER) of 0.0829, and a SER of 0.10.
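The abstract reports CER, WER, and SER without stating how they were computed. As a rough illustration only, the sketch below uses the common edit-distance definitions: CER and WER are Levenshtein distances over characters and whitespace-separated words, divided by the reference length. SER is assumed here to mean the fraction of lines whose transcription is not an exact match. That reading, and the function names `edit_distance` and `error_rates`, are assumptions for illustration and are not taken from the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rates(refs, hyps):
    """Corpus-level CER, WER, and (assumed) sequence error rate.

    refs and hyps are parallel lists of ground-truth and predicted lines.
    """
    char_errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    char_total = sum(len(r) for r in refs)
    word_errors = sum(edit_distance(r.split(), h.split())
                      for r, h in zip(refs, hyps))
    word_total = sum(len(r.split()) for r in refs)
    # Assumption: SER counts a line as wrong unless it matches exactly.
    seq_errors = sum(r != h for r, h in zip(refs, hyps))
    return {
        "cer": char_errors / char_total,
        "wer": word_errors / word_total,
        "ser": seq_errors / len(refs),
    }


if __name__ == "__main__":
    references = ["abc def", "xyz"]
    predictions = ["abd def", "xyz"]
    print(error_rates(references, predictions))
```

Python strings iterate over Unicode code points, so in this sketch each Arabic diacritic (a combining mark) counts as its own character. Whether the paper scores diacritics this way is not stated in the abstract.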
