Script Identification - A Han and Roman Script Perspective

Abstract

All Han-based scripts (Chinese, Japanese, and Korean) share similar visual characteristics, so developing a system that identifies Chinese, Japanese, and Korean scripts within a single document page is quite challenging. A Han-based document page may also contain Roman script. A multi-script OCR system that handles Chinese, Japanese, Korean, and Roman scripts therefore needs to identify the script before running the respective OCR module. We propose a system that addresses this problem using directional features together with a Gaussian kernel-based Support Vector Machine. We obtained promising results: 98.39% script identification accuracy at the character level and 99.85% at the block level, when no rejection was considered.
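
The abstract does not say how the directional features are computed or how the SVM is configured, so the following is only an illustrative sketch under stated assumptions. It assumes a gradient-direction histogram as the directional feature and shows the Gaussian (RBF) kernel that such an SVM would evaluate between two feature vectors. The names `directional_histogram` and `gaussian_kernel`, the 8-bin quantization, and the `gamma` value are hypothetical choices, not taken from the paper. SVM training itself is omitted.

```python
import math


def directional_histogram(image, bins=8):
    """Assumed directional feature: a histogram of gradient directions.

    `image` is a 2D list of grayscale values. Each interior pixel votes,
    weighted by gradient magnitude, for the direction bin nearest to its
    gradient angle. The histogram is L1-normalized.
    """
    height, width = len(image), len(image[0])
    hist = [0.0] * bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Central differences approximate the local gradient.
            gx = image[y][x + 1] - image[y][x - 1]
            gy = image[y + 1][x] - image[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue  # flat region, no direction to record
            angle = math.atan2(gy, gx) % (2 * math.pi)
            # Bins are centered on 0, 45, 90, ... degrees (for bins=8).
            index = int(round(angle / (2 * math.pi) * bins)) % bins
            hist[index] += magnitude
    total = sum(hist)
    return [v / total for v in hist] if total else hist


def gaussian_kernel(u, v, gamma=1.0):
    """Gaussian (RBF) kernel: exp(-gamma * ||u - v||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


# Toy 5x5 "strokes" standing in for character images (hypothetical data).
vertical_stroke = [[0, 0, 1, 0, 0] for _ in range(5)]
horizontal_stroke = [[0] * 5, [0] * 5, [1] * 5, [0] * 5, [0] * 5]

f_vertical = directional_histogram(vertical_stroke)
f_horizontal = directional_histogram(horizontal_stroke)

# Kernel similarity is highest for identical features and drops as the
# directional profiles diverge.
same = gaussian_kernel(f_vertical, f_vertical)
different = gaussian_kernel(f_vertical, f_horizontal)
```

In a full pipeline, feature vectors like these would be computed per character or per block and passed to an SVM trainer with one class per script (Chinese, Japanese, Korean, Roman). A real system would likely compute the histograms over a grid of sub-regions rather than the whole image, to retain spatial layout.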