An attention based method for offline handwritten Urdu text recognition


Abstract

Compared to derivatives of the Latin script, recognition of handwritten derivatives of the Arabic script is a complex task due to the two-dimensional structure of the text, the context-dependent shapes of characters, the high number of ligatures, the overlap of characters, and the placement of diacritics. While significant attempts exist for Latin and Arabic scripts, very few have been made for offline handwritten Urdu script. In this paper, we introduce a large, annotated dataset of handwritten Urdu sentences. We also present a methodology for the recognition of offline handwritten Urdu text lines. A deep-learning-based encoder/decoder framework with an attention mechanism is used to handle the two-dimensional text structure. While existing approaches report only character-level accuracy, the proposed model improves on the BLSTM-based state of the art by a factor of 2 in terms of character-level accuracy and by a factor of 37 in terms of word-level accuracy. Incorporating attention before a recurrent decoding framework helps the model look at appropriate locations before classifying the next character and therefore results in higher word-level accuracy.
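To make the idea of "looking at appropriate locations" concrete, here is a minimal sketch of one attention step over a flattened two-dimensional feature map. It is illustrative only: the abstract does not specify the scoring function or the network layout, so the scaled dot-product score, the toy feature values, and the names `softmax`, `attend`, `features`, and `query` are all assumptions rather than the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(features, query):
    """One attention step (hypothetical helper, not from the paper).

    features: list of D-dimensional vectors, one per image location
              (a 2D encoder feature map flattened into a list).
    query:    D-dimensional decoder state for the next character.
    Returns the attention weights and the weighted context vector.
    """
    scale = math.sqrt(len(query))
    # Assumed scoring rule: scaled dot product between location and query.
    scores = [sum(f * q for f, q in zip(feat, query)) / scale
              for feat in features]
    weights = softmax(scores)
    dim = len(features[0])
    # Context vector: attention-weighted sum of the location features.
    context = [sum(w * feat[d] for w, feat in zip(weights, features))
               for d in range(dim)]
    return weights, context


# Toy 2x2 feature map flattened to four locations, with D = 2.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
query = [2.0, 0.0]
weights, context = attend(features, query)
```

In a full recognizer, the decoder would combine `context` with its recurrent state to predict the next character, then compute a new query and repeat. Locations whose features align with the query (the first and third here) receive the largest weights.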


Bibliographic details

Source
International Conference on Frontiers in Handwriting Recognition | 2020 | pp. 169-174 | 6 pages
Authors
Tayaba Anjum; Nazar Khan
Original format
PDF
Keywords
Voltage control; Handwriting recognition; Convolution

