International Journal of Applied Pattern Recognition

Statistical comparison of classifiers for script identification from multi-script handwritten documents

Abstract

Script identification in handwritten document images is an open problem in document analysis, especially for multilingual optical character recognition (OCR) systems. To design an OCR system for multi-script document pages, it is essential to identify the script of each text unit before running the OCR engine specific to that script. The present work reports an intelligent feature-based technique for word-level script identification in multi-script handwritten document pages. First, the text lines are extracted from the document pages, and then the words are extracted from those lines. A set of 39 distinctive features has been designed for each word image, of which eight are topological and the remaining 31 are based on the convex hull. To select a suitable classifier, the performance of multiple classifiers is evaluated with the designed feature set on multiple subsets of the freely available database CMATERdb1.5.1, which comprises 150 handwritten document pages containing both Devnagari and Roman script words. Statistical significance tests on these performance measures indicate that the multilayer perceptron (MLP) is the best-performing classifier. The overall word-level script identification accuracy with the MLP classifier on this database is 99.74%.
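
The abstract does not list the 31 convex-hull features, so the following is only a minimal sketch of the kind of quantity such features could build on, not the authors' feature set. It assumes a word image given as a binary grid (1 = ink), computes the convex hull of the ink pixels with Andrew's monotone chain algorithm, and returns two illustrative measurements. The function names and the two measurements (`hull_vertices`, `hull_area`) are hypothetical choices made for this example.

```python
from typing import Dict, List, Tuple

Point = Tuple[int, int]


def foreground_points(image: List[List[int]]) -> List[Point]:
    """Return (x, y) coordinates of every non-zero (ink) pixel."""
    return [(x, y) for y, row in enumerate(image) for x, v in enumerate(row) if v]


def cross(o: Point, a: Point, b: Point) -> int:
    """Z-component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Andrew's monotone chain; collinear points are dropped from the hull."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


def polygon_area(vertices: List[Point]) -> float:
    """Shoelace formula for the area of a simple polygon."""
    n = len(vertices)
    twice_area = 0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def hull_features(image: List[List[int]]) -> Dict[str, float]:
    """Two illustrative (assumed) hull-based measurements for one word image."""
    hull = convex_hull(foreground_points(image))
    return {"hull_vertices": float(len(hull)), "hull_area": polygon_area(hull)}


# Toy 3x4 "word image", invented for illustration only.
word = [
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 1, 1, 0],
]
features = hull_features(word)
print(features)
```

In a real pipeline the grid would come from a binarised word image produced by the line and word extraction steps, and the resulting feature vector would be passed to the classifiers being compared.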
