Conference: International Conference on Document Analysis and Recognition

Urdu-Text: A Dataset and Benchmark for Urdu Text Detection and Recognition in Natural Scenes

Abstract

Multi-lingual text in natural scene images conveys useful information and is a fundamental tool for tourists to interact with their environment. Multi-lingual text detection and recognition in natural scenes has therefore become a challenging problem for researchers in the last few years. Recently, ICDAR published a large-scale multi-lingual dataset for scene text detection and script identification, which contains scene images with text in six different scripts, including Arabic. This paper presents a novel dataset and benchmark for Urdu text in natural scenes. Currently, no dataset for Urdu text in natural scenes is publicly available. Urdu is written in a cursive script that is derived from Arabic script and shares many of its characters. The proposed dataset could therefore be helpful for multi-lingual text detection, recognition, and script identification. The aim of this dataset is to support the research community in developing and evaluating algorithms for Urdu text in natural scenes. The Urdu-Text dataset contains 1400 complete scene images and 8200 segmented words. The images contain a broad variety of text instances in multiple orientations, with small and large font sizes. The ground truth is provided as word-level bounding boxes, the script of the text, and the text transcription. The performance of three deep neural networks is evaluated to measure the robustness of the Urdu-Text dataset.
