Conference: Document Recognition III

Document-specific character template estimation

Abstract

An approach to supervised training of document-specific character templates from sample page images and unaligned transcriptions is presented. The template estimation problem is formulated as one of constrained maximum likelihood parameter estimation within the document image decoding (DID) framework. This leads to a two-phase iterative training algorithm that alternates between a transcription alignment step and an aligned template estimation (ATE) step. The ATE step is the heart of the algorithm: it assigns template pixel colors to maximize likelihood while satisfying a template disjointedness constraint. The training algorithm is demonstrated on a variety of English documents, including newspaper columns, 15th-century books, degraded images of 19th-century newspapers, and connected text in a script-like font. Three applications enabled by the training procedure are described: high-accuracy document-specific decoding, transcription error visualization, and printer font generation.
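
The ATE step can be illustrated with a rough sketch. The code below is not the authors' implementation. It assumes an asymmetric bit-flip noise model with hypothetical parameters alpha0 and alpha1 (the probabilities that a background or foreground pixel is observed unchanged), assumes glyph positions are already known from the alignment step, and uses a simple greedy rule in which image pixels claimed by one template pixel are cleared so that other templates are discouraged from claiming them. That clearing is only a loose stand-in for the disjointedness constraint. The function name and its arguments are invented for illustration.

```python
import math
import numpy as np

def estimate_templates(image, origins, canvas_shape, alpha0=0.9, alpha1=0.9):
    """Greedy sketch of aligned template estimation (illustrative only).

    image        : 2D array of 0/1 pixels (1 = black).
    origins      : dict mapping a character label to a list of (row, col)
                   top-left canvas positions of its aligned instances.
                   Every canvas must lie fully inside the image.
    canvas_shape : (height, width) shared by all template canvases.
    alpha0/alpha1: assumed bit-flip channel parameters (hypothetical).
    """
    # Log-likelihood gain of turning a template pixel black is
    # gamma * (black image pixels under it) + beta * (number of instances).
    gamma = math.log(alpha0 * alpha1 / ((1 - alpha0) * (1 - alpha1)))
    beta = math.log((1 - alpha1) / alpha0)

    h, w = canvas_shape
    work = np.array(image, dtype=np.int64)  # copy; claimed pixels get cleared
    templates = {t: np.zeros(canvas_shape, dtype=np.uint8) for t in origins}

    def score(t, r, c):
        black = sum(int(work[r0 + r, c0 + c]) for r0, c0 in origins[t])
        return gamma * black + beta * len(origins[t])

    while True:
        best = None
        # Find the unassigned template pixel with the largest positive gain.
        for t in origins:
            for r in range(h):
                for c in range(w):
                    if templates[t][r, c]:
                        continue
                    s = score(t, r, c)
                    if s > 0 and (best is None or s > best[0]):
                        best = (s, t, r, c)
        if best is None:
            break  # no remaining pixel increases the likelihood
        _, t, r, c = best
        templates[t][r, c] = 1
        # Clear the image pixels this template pixel explains, so that
        # overlapping pixels of other templates score lower afterwards.
        for r0, c0 in origins[t]:
            work[r0 + r, c0 + c] = 0
    return templates
```

In the full two-phase procedure described above, a routine like this would be called after each transcription alignment pass, and the resulting templates would then be used for the next alignment. The exhaustive rescan on every iteration keeps the sketch short; a practical version would cache scores.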
