Conference: MMEDIA 2012

A Database of Artificial Urdu Text in Video Images with Semi-Automatic Text Line Labeling Scheme


Abstract

This paper describes a novel database of video images containing artificial (superimposed) Urdu text, together with a semi-automatic text line labeling scheme. The main objective of this study is to provide the community with a standard dataset and an auto-labeling scheme for the development and evaluation of algorithms for indexing and retrieval systems based on textual content. We focus specifically on Urdu text, which has attracted growing research interest in recent years. The data set comprises 1000 video images collected from 19 different channels in 5 different categories. An attempt is made to capture the maximum possible variation in the text in terms of size, location, appearance and background. The data set is fully labeled with the bounding rectangle of each text occurrence, which facilitates the evaluation of text detection and localization systems. Based on our previous work on text localization, an automatic text labeling scheme is also proposed, and the results obtained are compared with manual labeling. Ground truth data supporting tasks such as text recognition and word spotting will be considered in the next version of the data set.
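The abstract does not say how the automatic labels are compared with the manual ones. As an illustration only, the sketch below uses intersection over union (IoU), a common overlap measure for bounding rectangles; it is not taken from the paper. The `Box` layout, the `iou` function, and the sample coordinates are all hypothetical.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding rectangle in pixels (hypothetical layout)."""
    x: int       # left edge
    y: int       # top edge
    width: int
    height: int


def iou(a: Box, b: Box) -> float:
    """Return the intersection over union of two rectangles, in [0, 1]."""
    left = max(a.x, b.x)
    top = max(a.y, b.y)
    right = min(a.x + a.width, b.x + b.width)
    bottom = min(a.y + a.height, b.y + b.height)
    # Clamp at zero so disjoint rectangles give an empty intersection.
    inter = max(0, right - left) * max(0, bottom - top)
    union = a.width * a.height + b.width * b.height - inter
    return inter / union if union > 0 else 0.0


# Made-up example: a manual label and an automatic label shifted by 10 px.
manual = Box(x=10, y=20, width=100, height=30)
automatic = Box(x=20, y=20, width=100, height=30)
print(f"IoU = {iou(manual, automatic):.3f}")
```

An IoU of 1 means the two rectangles coincide and 0 means they do not overlap. Any threshold for counting an automatic label as correct would be a further assumption.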
