Journal: International Journal on Document Analysis and Recognition (IJDAR)

Comprehensive synthetic Arabic database for on/off-line script recognition research

Abstract

Developing and maintaining large, comprehensive databases for script recognition that include different shapes for each word in the lexicon is expensive and difficult. In this paper, we present an efficient system that automatically generates prototypes for each word in a lexicon using multiple appearances of each letter. Large sets of different shapes are created for each letter in each position. These sets are then used to generate valid shapes for each word-part. The number of valid permutations for each word is large, which makes training and searching impractical for tasks such as script recognition and word spotting. We apply dimensionality reduction and clustering techniques to maintain a compact representation of these databases without affecting their ability to represent the wide variety of handwriting styles. In addition, a database for off-line script recognition is generated from the on-line strokes using a standard dilation technique, with special effort made to resemble the pen’s path. We also examined and used several layout techniques for producing words from the generated word-parts. Our experimental results show that the proposed system can automatically generate large databases whose quality is at least as good as that of manually generated ones.
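
The pipeline described in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the toy letter strokes, the function names, the fixed `advance` spacing, and the distance threshold are all hypothetical. The abstract does not name its clustering or dimensionality reduction methods, so a simple greedy "leader" clustering stands in for them here, and the dilation step thickens only the sampled pen positions rather than the full pen path.

```python
from itertools import product
from math import dist

# Hypothetical toy data: each letter variant is a short on-line stroke,
# i.e. a list of (x, y) pen positions. Real data would come from writers.
LETTER_VARIANTS = {
    "a": [[(0, 0), (1, 2), (2, 0)], [(0, 0), (1, 2.1), (2, 0)], [(0, 0), (1, 4), (2, 0)]],
    "b": [[(0, 0), (0, 3), (2, 3)], [(0, 0), (0, 3), (2, 1)]],
}


def compose(word_part, variants=LETTER_VARIANTS, advance=3):
    """Yield every shape of word_part: one variant per letter, laid out left to right."""
    choices = [variants[ch] for ch in word_part]
    for combo in product(*choices):
        shape = []
        for i, stroke in enumerate(combo):
            # Shift the i-th letter horizontally by a fixed (assumed) advance.
            shape.extend((x + i * advance, y) for x, y in stroke)
        yield shape


def shape_distance(s, t):
    """Mean point-to-point distance; assumes both shapes have the same number of points."""
    return sum(dist(p, q) for p, q in zip(s, t)) / len(s)


def leader_cluster(shapes, threshold):
    """Greedy stand-in for clustering: keep a shape only if it is farther than
    threshold from every shape kept so far."""
    leaders = []
    for s in shapes:
        if all(shape_distance(s, kept) > threshold for kept in leaders):
            leaders.append(s)
    return leaders


def dilate(points, width, height, radius=1):
    """Rasterize pen positions onto a binary grid and thicken each one with a
    square structuring element of the given radius (a basic dilation)."""
    image = [[0] * width for _ in range(height)]
    for x, y in points:
        cx, cy = round(x), round(y)
        for yy in range(cy - radius, cy + radius + 1):
            for xx in range(cx - radius, cx + radius + 1):
                if 0 <= xx < width and 0 <= yy < height:
                    image[yy][xx] = 1
    return image


if __name__ == "__main__":
    shapes = list(compose("ab"))
    compact = leader_cluster(shapes, threshold=0.1)
    print(len(shapes), "generated shapes,", len(compact), "kept as prototypes")
    for row in reversed(dilate(compact[0], width=8, height=6)):
        print("".join("#" if v else "." for v in row))
```

The number of generated shapes is the product of the variant counts per letter, which is why the abstract stresses compaction: with dozens of variants per letter, even short word-parts yield far more shapes than can be stored or searched directly.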