Conference: International conference on image processing, computer vision, pattern recognition

Document image segmentation by a cascade of Pseudo-Word contextual labelings

Abstract

The aim of this work is to classify document content as handwritten (H), printed (P), or noise (N). In a first step, pseudo-lines and pseudo-words of writing are extracted by smearing. The pseudo-words are then classified into (P, H, N) using an SVM with a Gaussian kernel. In a second step, the context of each pseudo-word is examined along its pseudo-line in order to propagate the script type and correct errors. First, the word separation is modeled by a conditional random field. Then, the context is extended using a cascade of contextual propagation modules. Our system achieves pseudo-word classification rates of 97.3% for handwritten text and 99.5% for printed text, for an overall rate of 98.7%.
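
The two-step idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no parameters, so the function names, the `max_gap` threshold, and the window `radius` are all hypothetical. The smearing shown is a simple horizontal run-length fill on one binary pixel row, and a plain majority vote along the pseudo-line stands in for the conditional random field and the cascade of propagation modules, which the abstract does not specify.

```python
from collections import Counter


def smear_row(row, max_gap):
    """Horizontal run-length smearing on one binary row (1 = ink).

    Background gaps of at most max_gap pixels that lie between two ink
    pixels are filled, so nearby characters merge into one pseudo-word.
    max_gap is an assumed parameter, not a value from the paper.
    """
    out = list(row)
    last_ink = None
    for x, value in enumerate(row):
        if value:
            if last_ink is not None and 0 < x - last_ink - 1 <= max_gap:
                for k in range(last_ink + 1, x):
                    out[k] = 1
            last_ink = x
    return out


def ink_runs(row):
    """Return half-open (start, end) spans of consecutive ink pixels."""
    spans, start = [], None
    for x, value in enumerate(row):
        if value and start is None:
            start = x
        elif not value and start is not None:
            spans.append((start, x))
            start = None
    if start is not None:
        spans.append((start, len(row)))
    return spans


def smooth_labels(labels, radius=1):
    """Majority vote over neighbouring pseudo-words on one pseudo-line.

    A simplified stand-in for contextual propagation: each label is
    replaced by the most frequent label in its window; ties keep the
    original label.
    """
    out = []
    for i, label in enumerate(labels):
        window = labels[max(0, i - radius): i + radius + 1]
        counts = Counter(window).most_common()
        top, n = counts[0]
        tie = len(counts) > 1 and counts[1][1] == n
        out.append(label if tie else top)
    return out


row = [1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1]
pseudo_words = ink_runs(smear_row(row, max_gap=1))
corrected = smooth_labels(["P", "P", "H", "P", "P"])
```

In this toy input the two one-pixel gaps are filled while the four-pixel gap is kept, so the row yields two pseudo-words. The isolated "H" among printed neighbours is relabelled "P", which is the kind of error correction along a pseudo-line that the second step aims at.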