Conference: Document Recognition III

Character recognition in a Japanese text recognition system


Abstract

Cherry Blossom is a machine-printed Japanese document recognition system developed at CEDAR in recent years. This paper focuses on the character recognition part of the system. For Japanese character classification, the system uses two feature sets: the local stroke direction feature, and the gradient, structural and concavity feature. Based on each of these feature sets, two different classifiers are designed: the minimum error (ME) subspace classifier and the fast nearest-neighbor (FNN) classifier. The original version of the FNN classifier used only Euclidean distance; the new version uses both Euclidean distance and the distance calculation defined in the ME subspace method. This integration improved performance significantly. The classifiers handle about 3,300 character classes (alphanumeric, kana and JIS level-1 Kanji). They were trained and tested on 200 ppi character images from the CEDAR Japanese character image CD-ROM.
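The abstract does not give the formulas behind the combined distance, so the following is only a minimal sketch of one plausible reading: Euclidean distance to class means selects a short candidate list, and a subspace distance reranks those candidates. Everything here is an assumption made for illustration, not the authors' implementation. The function names (`fit_class_models`, `subspace_distance`, `classify`), the PCA-style class subspaces, the projection-residual distance, and the two-stage structure are hypothetical. The distance defined in the ME subspace method may differ.

```python
import numpy as np


def fit_class_models(X, y, k):
    """Return, per class, its mean and a k-dimensional orthonormal basis.

    Assumption: each class subspace is spanned by the top-k principal
    directions of that class's training feature vectors.
    """
    models = {}
    for label in np.unique(y):
        samples = X[y == label]
        mean = samples.mean(axis=0)
        # Right singular vectors of the centered samples, strongest first.
        _, _, vt = np.linalg.svd(samples - mean, full_matrices=False)
        models[label] = (mean, vt[:k])
    return models


def subspace_distance(x, mean, basis):
    """Squared residual of x after projection onto the class subspace."""
    d = x - mean
    proj = basis @ d
    return float(d @ d - proj @ proj)


def classify(x, models, n_candidates=3):
    """Two-stage decision: Euclidean shortlist, then subspace rerank."""
    labels = list(models)
    # Stage 1: Euclidean distance to each class mean picks the candidates.
    euclid = np.array([np.linalg.norm(x - models[c][0]) for c in labels])
    shortlist = [labels[i] for i in np.argsort(euclid)[:n_candidates]]
    # Stage 2: the candidate with the smallest projection residual wins.
    return min(shortlist, key=lambda c: subspace_distance(x, *models[c]))
```

In this sketch the cheap Euclidean pass keeps the more expensive projection step away from most of the roughly 3,300 classes. The `n_candidates` value is an arbitrary illustrative default, not a setting reported in the paper.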
