A Comprehensive Analysis of Misclassified Handwritten Chinese Character Samples by Incorporating Human Recognition


Abstract

The development of convolutional neural networks (CNNs) has led to revolutionary progress on the offline handwritten Chinese character recognition (HCCR) problem. Because the recognition rate on a standard offline HCCR testbed is now very high, the few samples that remain misclassified have kindled our interest. In this paper, with the help of human recognition results, we present a comprehensive analysis of the samples misclassified by a state-of-the-art CNN model. We performed the analysis based on the top-1-votes, which are obtained from a statistical analysis of human recognition results, and derived the following conclusions: (1) the majority of samples with high top-1-votes were mislabeled; in addition, by comparing the results of human recognition with those of the CNN, we present some limitations of the CNN that leave scope for further improvement; (2) among the samples with medium top-1-votes, samples with different confidence levels have different characteristics, and some of them could be regarded as multi-label samples; (3) the samples with low top-1-votes are either wrongly written or written in a heavily cursive style, which makes them difficult to match to their given ground truths; (4) the relationship between writing styles and misclassifications is also discussed. We believe this work provides some insights and new clues for designing classification methods that can deal with these challenging samples.
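The abstract does not define top-1-votes precisely, so the following is only a minimal sketch of how such a grouping might be computed. It assumes that top-1-votes means the number of human answers received by the most frequently chosen label. The function names, the sample layout, and the `low`/`high` thresholds are hypothetical and are not taken from the paper.

```python
from collections import Counter


def top1_votes(human_labels):
    """Return (label, votes) for the label chosen most often by human readers.

    Assumption: this is what the abstract calls "top-1-votes".
    """
    if not human_labels:
        raise ValueError("need at least one human answer")
    label, votes = Counter(human_labels).most_common(1)[0]
    return label, votes


def vote_level(votes, n_voters, low=0.4, high=0.8):
    """Bucket a sample as high / medium / low by its share of top-1 votes.

    The thresholds are illustrative placeholders, not values from the paper.
    """
    share = votes / n_voters
    if share >= high:
        return "high"
    if share >= low:
        return "medium"
    return "low"


def analyze(sample):
    """Summarize one CNN-misclassified sample.

    `sample` is a hypothetical dict with keys "human" (list of labels given
    by human readers), "ground_truth" (dataset label) and "cnn" (CNN output).
    """
    label, votes = top1_votes(sample["human"])
    return {
        "top1_label": label,
        "top1_votes": votes,
        "level": vote_level(votes, len(sample["human"])),
        # Strong human agreement on a label other than the ground truth
        # suggests the sample may be mislabeled.
        "humans_disagree_with_ground_truth": label != sample["ground_truth"],
        "humans_agree_with_cnn": label == sample["cnn"],
    }
```

Under these assumptions, a sample in the "high" bucket whose top-1 label differs from the ground truth would be a candidate for the mislabeled case in conclusion (1), while "low" samples correspond to the wrongly written or heavily cursive cases in conclusion (3).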


Bibliographic details

Source
IAPR International Conference on Document Analysis and Recognition | 2017 | 732p | 6 pages
Authors
Kaihuan Liang; Lianwen Jin; Zecheng Xie; Xuefeng Xiao; Weiguo Huang;
Original format: PDF
Chinese Library Classification: TP391.41-53
Keywords
Character recognition; Writing; Machine learning; Handwriting recognition; Standards; Image color analysis; Color;


Similar literature


1. Importance sampling based discriminative learning for large scale offline handwritten Chinese character recognition [J]. Wang Yanwei, Fu Qiang, Ding Xiaoqing. Pattern Recognition: The Journal of the Pattern Recognition Society. 2015, No. 4
2. Research on Feature Extraction Method for Handwritten Chinese Character Recognition Based on Kernel Independent Component Analysis [J]. He Zhiguo, Yang Xiaoli. Research journal of applied science, engineering and technology. 2013, No. 7
3. A Comprehensive Analysis of Misclassified Handwritten Chinese Character Samples by Incorporating Human Recognition [C]. Kaihuan Liang, Lianwen Jin, Zecheng Xie. IAPR International Conference on Document Analysis and Recognition. 2017
4. Video-based handwritten Chinese character recognition [D]. Lin, Feng. 2003
5. BanglaLekha-Isolated: A multi-purpose comprehensive dataset of Handwritten Bangla Isolated characters [O]. Mithun Biswas, Rafiqul Islam, Gautam Kumar Shom. 2017
6. Online and Offline Handwritten Chinese Character Recognition: A Comprehensive Study and New Benchmark [O]. Zhang, Xu-Yao; Bengio, Yoshua; Liu, Cheng-Lin. 2016
