Journal: International Journal on Document Analysis and Recognition

Discovering similar Chinese characters in online handwriting with deep convolutional neural networks



Abstract

A primary reason for performance degradation in unconstrained online handwritten Chinese character recognition is the subtle difference between similar characters. Various methods have been proposed in previous work to address the problem of generating similar characters. These methods basically comprise two components: similar character discovery and cascaded classifiers. The goal of similar character discovery is to make the similar character pairs/sets cover as many misclassified samples as possible. We observe that the confidence of a convolutional neural network (CNN) is output in an end-to-end manner and can be understood as a type of probability metric. In this paper, we propose an algorithm that leverages CNN confidence to discover similar character pairs/sets. Specifically, a deep CNN is applied to output the top-ranked candidates and the corresponding confidence scores, followed by an accumulating and averaging procedure. We found experimentally that the number of similar character pairs differs from class to class and that the degree of confusion varies among similar character pairs. To address these problems, we propose an entropy-based similarity measurement to rank the similar character pairs/sets and reject those with low similarity. The experimental results indicate that, using 30,000 similar character pairs, our method achieves hit rates of 98.44% and 98.05% on the CASIA-OLHWDB1.0 and CASIA-OLHWDB1.0-1.2 datasets, respectively, which are significantly higher than the corresponding results produced by the MQDF-based method (95.42% and 94.49%). Furthermore, recognition of ten randomly selected similar character subsets with a two-stage classification scheme yields a relative error reduction of 30.11% compared with the traditional single-stage scheme, showing the potential usefulness of the proposed method.
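The discovery procedure described in the abstract (top-ranked candidates, accumulated and averaged confidences, then an entropy-based ranking with rejection) can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration: the function names `average_confusion`, `pair_entropy`, and `rank_similar_pairs` are hypothetical, `probs` stands for softmax outputs of some already trained CNN, and the binary entropy over the renormalized confidences of the true class and one candidate is only one plausible reading of "entropy-based similarity", not necessarily the authors' measure.

```python
import numpy as np

def average_confusion(probs, labels, num_classes, top_k=3):
    """Accumulate the confidences of the top-k candidates per true class,
    then average over the number of samples of that class."""
    probs = np.asarray(probs, dtype=float)
    acc = np.zeros((num_classes, num_classes))
    counts = np.zeros(num_classes)
    for p, y in zip(probs, labels):
        top = np.argsort(p)[::-1][:top_k]   # indices of the k highest confidences
        acc[y, top] += p[top]
        counts[y] += 1
    counts = np.maximum(counts, 1)          # avoid division by zero for empty classes
    return acc / counts[:, None]

def pair_entropy(avg, i, j):
    """Assumed similarity: binary entropy (in bits) of the averaged confidences
    of true class i and candidate j after renormalizing them to sum to 1.
    Values near 1 mean the two classes are hard to tell apart."""
    q = np.array([avg[i, i], avg[i, j]])
    total = q.sum()
    if total <= 0:
        return 0.0
    q = q / total
    q = q[q > 0]                            # treat 0 * log(0) as 0
    return float(-(q * np.log2(q)).sum())

def rank_similar_pairs(avg, threshold=0.0):
    """Rank (class, candidate) pairs by the assumed entropy score, highest
    first, and reject pairs scoring below the threshold."""
    n = avg.shape[0]
    pairs = [(pair_entropy(avg, i, j), i, j)
             for i in range(n) for j in range(n)
             if i != j and avg[i, j] > 0]
    pairs = [p for p in pairs if p[0] >= threshold]
    pairs.sort(reverse=True)
    return pairs

# Toy data: four samples over three classes (made-up confidences).
probs = [[0.6, 0.4, 0.0], [0.5, 0.5, 0.0], [0.1, 0.9, 0.0], [0.0, 0.05, 0.95]]
labels = [0, 0, 1, 2]
avg = average_confusion(probs, labels, num_classes=3, top_k=2)
for score, i, j in rank_similar_pairs(avg, threshold=0.4):
    print(f"class {i} vs candidate {j}: entropy {score:.3f}")
```

In this toy setting the threshold plays the role of the rejection step: pairs whose averaged confidences are very lopsided receive a low entropy and are dropped, while evenly split pairs are kept as similar-character candidates.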
