Urdu-Text: A Dataset and Benchmark for Urdu Text Detection and Recognition in Natural Scenes

Abstract

Multi-lingual text in natural scene images conveys useful information and is a fundamental tool for tourists to interact with their environment. Multi-lingual text detection and recognition in natural scenes has therefore become a challenging problem for researchers in the last few years. Recently, ICDAR published a large-scale multi-lingual dataset for scene text detection and script identification, which contains scene images with text in six different scripts, including Arabic. This paper presents a novel dataset and benchmark for Urdu text in natural scenes. Currently, no dataset for Urdu text in natural scenes is publicly available. Urdu is written in a cursive script that is derived from Arabic script and shares many of its characters. Therefore, the proposed dataset could be helpful for multi-lingual text detection, recognition and script identification. The aim of this dataset is to support the research community in developing and evaluating algorithms for Urdu text in natural scenes. The Urdu-Text dataset contains 1400 complete scene images and 8200 segmented words. The images in the dataset contain a broad variety of text instances in multiple orientations and in both small and large font sizes. The dataset provides ground truth in the form of word-level bounding boxes, the script of the text, and the text transcription. The performance of three deep neural networks is evaluated to measure the robustness of the Urdu-Text dataset.
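
The abstract says each word carries three pieces of ground truth: a bounding box, a script label, and a transcription. The sketch below shows one way such a record might be represented and parsed. It is only an illustration: the comma-separated line layout (four corner points, then script, then transcription) is an assumption modelled on the style of the ICDAR multi-lingual ground-truth files, not the documented format of the Urdu-Text dataset, and the names `WordAnnotation`, `parse_annotation_line` and `axis_aligned_box` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class WordAnnotation:
    """One word-level ground-truth entry (hypothetical layout)."""
    corners: List[Tuple[int, int]]  # four (x, y) corner points of the word box
    script: str                     # script label, e.g. "Urdu"
    transcription: str              # text content of the word


def parse_annotation_line(line: str) -> WordAnnotation:
    """Parse 'x1,y1,x2,y2,x3,y3,x4,y4,script,transcription'.

    This line layout is an assumption, not the dataset's documented format.
    """
    # Split at most 9 times so commas inside the transcription are preserved.
    parts = line.rstrip("\n").split(",", 9)
    if len(parts) != 10:
        raise ValueError(f"expected 10 fields, got {len(parts)}")
    coords = [int(value) for value in parts[:8]]
    # Pair up the flat coordinate list into (x, y) tuples.
    corners = list(zip(coords[0::2], coords[1::2]))
    return WordAnnotation(corners, parts[8], parts[9])


def axis_aligned_box(annotation: WordAnnotation) -> Tuple[int, int, int, int]:
    """Return (x_min, y_min, x_max, y_max) enclosing the four corners."""
    xs = [x for x, _ in annotation.corners]
    ys = [y for _, y in annotation.corners]
    return min(xs), min(ys), max(xs), max(ys)


# Made-up example line; the escaped string is the word "Urdu" in Urdu script.
example = parse_annotation_line(
    "10,20,110,18,112,60,12,62,Urdu,\u0627\u0631\u062f\u0648"
)
print(example.script, axis_aligned_box(example))
```

Storing four corner points rather than a plain rectangle leaves room for the multi-oriented text the abstract mentions; `axis_aligned_box` recovers an upright rectangle when a detector expects one.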

Bibliographic details

Source
International Conference on Document Analysis and Recognition | 2019 | pp. 323-328 | 6 pages
Authors
Asghar Ali; Mark Pickering
Original format
PDF
Keywords
Text recognition; Training; Image recognition; Testing; Feature extraction; Tools; Image segmentation
Date indexed
2022-08-26 15:04:11
