
Reduced annotation based on deep active learning for Arabic text detection in natural scene images



Abstract

Providing a labeled dataset of Arabic text images for scene text detection is inherently difficult and costly. Consequently, only a few small datasets are available for this task. Previous work has focused only on data augmentation of small datasets; however, the images generated with these techniques cannot reproduce the complexity and variability of natural images. In this paper, we propose a new dataset of Arabic text images collected with the Google Street View service, named the Tunisia Street View Dataset (TSVD). The dataset contains 7k images collected from different Tunisian cities. It is much more diverse and complex than current image datasets. To take advantage of this dataset for training Convolutional Neural Network (CNN) models, annotation is required in order to build high-performance models. The annotation task consumes a lot of researchers' time and effort because it is repetitive, and development time for text detection systems in natural images is valuable and should be used effectively. We therefore developed a deep active learning algorithm for the annotation phase, approaching the annotation suggestion task with a deep learning text detector. CNNs are used to perform text detection in natural scene images. Our deep active learning framework combines CNNs with an active learning approach. This reduces annotation effort by making pertinent suggestions about the most effective areas to annotate. We use the uncertainty provided by the CNN models to determine the most uncertain areas for annotation. Deep active learning is shown to significantly reduce the number of training samples required and to reduce the annotation work on our dataset (up to 1/5). Our dataset is publicly available in IEEE DataPort: https://dx.doi.org/10.21227/extw-0k60. (c) 2022 Elsevier B.V. All rights reserved.
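The uncertainty-driven selection described in the abstract can be illustrated with a minimal sketch. The abstract does not say which uncertainty measure or selection rule the authors use, so everything below is an assumption for illustration only: the detector is taken to output a per-pixel text probability map, uncertainty is scored as mean binary entropy, and the names `binary_entropy`, `region_uncertainty`, `select_for_annotation`, and the toy probability values are hypothetical.

```python
import math
from typing import Dict, List, Sequence

ProbMap = Sequence[Sequence[float]]  # per-pixel P(text), assumed detector output


def binary_entropy(p: float) -> float:
    """Entropy in bits of a text/non-text probability; largest at p = 0.5."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def region_uncertainty(prob_map: ProbMap) -> float:
    """Mean per-pixel entropy of one image or region (assumed scoring rule)."""
    values = [binary_entropy(p) for row in prob_map for p in row]
    return sum(values) / len(values) if values else 0.0


def select_for_annotation(prob_maps: Dict[str, ProbMap], budget: int) -> List[str]:
    """Return the ids of the `budget` most uncertain items, most uncertain first."""
    scores = {item_id: region_uncertainty(pm) for item_id, pm in prob_maps.items()}
    return sorted(scores, key=lambda item_id: scores[item_id], reverse=True)[:budget]


# Toy probability maps standing in for detector predictions (hypothetical values).
toy_maps: Dict[str, ProbMap] = {
    "img_a": [[0.98, 0.99], [0.01, 0.02]],  # detector is confident
    "img_b": [[0.50, 0.55], [0.45, 0.60]],  # detector is ambiguous
    "img_c": [[0.90, 0.10], [0.80, 0.30]],  # somewhere in between
}

selected = select_for_annotation(toy_maps, budget=2)
print(selected)
```

In an active learning loop of this kind, the selected items would be sent to a human annotator, added to the training set, and the detector retrained before the next round of selection. The budget per round is a free parameter in this sketch.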

Bibliographic record

  • Source
    Pattern Recognition Letters | 2022, issue 5 | pp. 42-48 | 7 pages
  • Author affiliations

    Univ Sfax, Natl Engn Sch Sfax ENIS, Res Grp Intelligent Machines REGIM Lab, BP 1173, Sfax 3038, Tunisia;

    Taif Univ, Dept Comp Sci, Coll Comp & Informat Technol, POB 11099, At Taif 21944, Saudi Arabia;

    King Saud Univ, Coll Appl Comp Sci, Riyadh, Saudi Arabia;

    Univ Sfax, Natl Engn Sch Sfax ENIS, Res Grp Intelligent Machines REGIM Lab, BP 1173, Sfax 3038, Tunisia | Univ Johannesburg, Dept Elect & Elect Engn Sci, Fac Engn & Built Environm, Johannesburg, South Africa;

  • Indexed in: Science Citation Index (SCI); Engineering Index (EI)
  • Original format: PDF
  • Language: English
  • Chinese Library Classification
  • Keywords

    Active learning; Deep learning; Annotation; Natural scene images; Text detection;
