Conference: International Conference on Document Analysis and Recognition

A new method of character line extraction from mixed-unformatted document image for Japanese mail address recognition

Abstract

We present a new method for extracting horizontal and vertical character lines from mixed (handwritten and printed) unformatted document images, in which characters vary in size, spacing and orientation and are nested among advertisement text, drawings and photographs. The method uses inherent features of a character line, such as the number and size of the characters it contains and the angular spectrum of the characters. When an area contains characters along both horizontal and vertical lines, a competitive judgment is applied. Using multi-set thresholds in a bottom-up methodology, we successfully extract Japanese mail address character lines. In a test on 957 address character lines taken from 252 pieces of mail, a 95.9% correct extraction rate was achieved.
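The abstract gives no implementation details, so the following is only a minimal sketch of how bottom-up line grouping with several threshold sets and a horizontal-versus-vertical comparison might look. It is not the authors' algorithm. Everything in it is an assumption made for illustration: the `Box` type, the `chain`, `score` and `extract_lines` functions, the threshold values, the "stop at the first threshold set that yields a line" rule, and the squared-length score used for the competitive judgment. The angular spectrum feature mentioned in the abstract is omitted.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Box:
    """Bounding box of one character candidate (hypothetical input format)."""
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def chain(boxes: Sequence[Box], horizontal: bool,
          gap_ratio: float, align_ratio: float) -> List[List[Box]]:
    """Greedily group boxes into lines along one direction (bottom-up)."""
    key = (lambda b: b.x) if horizontal else (lambda b: b.y)
    lines: List[List[Box]] = []
    for b in sorted(boxes, key=key):
        for line in lines:
            last = line[-1]
            size = max(last.w, last.h, b.w, b.h)
            if horizontal:
                gap = b.x - (last.x + last.w)
                offset = abs((b.y + b.h / 2) - (last.y + last.h / 2))
            else:
                gap = b.y - (last.y + last.h)
                offset = abs((b.x + b.w / 2) - (last.x + last.w / 2))
            # Thresholds scale with character size (an assumed heuristic).
            if gap <= gap_ratio * size and offset <= align_ratio * size:
                line.append(b)
                break
        else:
            lines.append([b])
    # A line needs at least two characters to count as a candidate.
    return [line for line in lines if len(line) >= 2]


def score(lines: List[List[Box]]) -> int:
    """Assumed score: favors fewer, longer lines."""
    return sum(len(line) ** 2 for line in lines)


def extract_lines(
    boxes: Sequence[Box],
    threshold_sets: Sequence[Tuple[float, float]] = ((0.5, 0.3), (1.0, 0.4), (2.0, 0.5)),
) -> Tuple[str, List[List[Box]]]:
    """Try threshold sets from strict to loose, then pick an orientation."""
    for gap_ratio, align_ratio in threshold_sets:
        h_lines = chain(boxes, True, gap_ratio, align_ratio)
        v_lines = chain(boxes, False, gap_ratio, align_ratio)
        if h_lines or v_lines:
            # Competitive judgment (assumed form): the higher score wins.
            if score(h_lines) >= score(v_lines):
                return ("horizontal", h_lines)
            return ("vertical", v_lines)
    return ("none", [])


if __name__ == "__main__":
    # Two rows of five closely spaced characters, rows far apart.
    rows = [Box(12.0 * i, 25.0 * r, 10.0, 10.0) for r in range(2) for i in range(5)]
    orientation, lines = extract_lines(rows)
    print(orientation, len(lines))
```

In this sketch the character boxes are assumed to come from an earlier step such as connected-component analysis, which the abstract does not describe.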
