ICPR 2012; International Conference on Pattern Recognition

Font identification — In context of an Indic script

Abstract

Font can serve as a notion of similarity among multiple documents written in the same script. We could automatically retrieve document images in a specific font from a large digital document repository. Optical Font Recognition could therefore be a useful pre-processing step in an automated questioned document analysis system, sorting documents by similar fonts. We propose a scheme to identify 10 different fonts for an Indic script (Bangla). Curvature-based features are extracted from segmented characters and fed to a Support Vector Machine (SVM) classifier, which determines the font type of each segmented character obtained from a document. The font of the document is then identified by majority voting among the 10 fonts over all of its characters. Using a Multiple Kernel SVM classifier, we obtained 98.5% accuracy on 400 test documents (40 documents for each font type).
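
The document-level voting step described in the abstract can be illustrated with a minimal sketch. It assumes the per-character font labels have already been produced by some classifier; the function name `identify_document_font`, the label strings such as `font_3`, and the tie-breaking rule are hypothetical choices made for illustration and are not specified in the abstract.

```python
from collections import Counter
from typing import Sequence


def identify_document_font(char_predictions: Sequence[str]) -> str:
    """Return the font label assigned to the most characters in one document.

    char_predictions holds one predicted font label per segmented character.
    Hypothetical helper: the abstract only states that majority voting is used.
    """
    if not char_predictions:
        raise ValueError("document has no segmented characters")
    counts = Counter(char_predictions)
    # most_common(1) yields the label with the highest vote count. On a tie,
    # the label seen first wins (an assumption; the abstract gives no rule).
    label, _ = counts.most_common(1)[0]
    return label


# Hypothetical per-character predictions for a single document.
doc_predictions = ["font_3", "font_3", "font_7", "font_3", "font_1", "font_3"]
print(identify_document_font(doc_predictions))
```

Voting over all characters means a few misclassified characters do not change the document-level decision as long as the correct font still receives the most votes.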
