Conference proceedings: Character Recognition Technologies

Effectiveness of feature and classifier algorithms in character recognition systems

Abstract

At the first Census Optical Character Recognition Systems Conference, NIST generated accuracy data for more than [...] character recognition systems. Most systems were tested on the recognition of isolated digits and upper- and lower-case alphabetic characters. The recognition experiments were performed on samples of 58,000 digits and 12,000 upper- and lower-case alphabetic characters. The algorithms used by the 26 conference participants included rule-based methods, image-based methods, statistical methods, and neural networks. The neural network methods included multi-layer perceptrons, Learning Vector Quantization, Neocognitrons, and cascaded neural networks. In this paper, 11 different systems are compared in three ways: by the correlations between the answers of different systems, by the decrease in error rate as a function of recognition confidence, and by the writer dependence of recognition. The comparison shows that systems using different algorithms for feature extraction and recognition produced very highly correlated results. This holds for neural network systems, hybrid systems, and statistically based systems, and leads to the conclusion that neural networks have not yet demonstrated a clear superiority over more conventional statistical methods. Comparison of these results with the models of Vapnik (for estimation problems), MacKay (for Bayesian statistical models), Moody (for effective parameterization), and Boltzmann models (for information content) demonstrates that, as the limits of training data variance are approached, all classifier systems have similar statistical properties. The limiting condition can be approached only for sufficiently rich feature sets, because the accuracy limit is controlled by the available information content of the training set, which must pass through the feature extraction process before classification.
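The abstract does not say how the comparisons were computed. As a minimal sketch only, two of them (correlation between the answers of two systems, and error rate as a function of confidence) could be approximated as below. Everything here is an assumption for illustration: the function names, the choice of Pearson correlation over per-sample correctness indicators, the rule of rejecting the least confident samples first, and the toy data are all hypothetical and are not taken from the paper.

```python
from math import sqrt

def correctness(preds, truth):
    """Return 1 where a system's answer matches the reference label, else 0."""
    return [1 if p == t else 0 for p, t in zip(preds, truth)]

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")  # undefined when one system never varies
    return cov / sqrt(vx * vy)

def error_vs_reject(preds, truth, conf, reject_fraction):
    """Error rate on the samples kept after rejecting the least confident ones."""
    order = sorted(range(len(preds)), key=lambda i: conf[i])
    n_reject = int(reject_fraction * len(preds))
    kept = order[n_reject:]
    if not kept:
        return 0.0
    errors = sum(1 for i in kept if preds[i] != truth[i])
    return errors / len(kept)

# Toy data (hypothetical): reference labels, two systems' answers,
# and system A's confidence for each answer.
truth = [3, 5, 7, 1, 0, 9, 4, 2]
sys_a = [3, 5, 1, 1, 0, 4, 4, 2]
sys_b = [3, 5, 7, 1, 0, 4, 4, 7]
conf_a = [0.99, 0.95, 0.40, 0.90, 0.97, 0.35, 0.92, 0.88]

agreement = pearson(correctness(sys_a, truth), correctness(sys_b, truth))
print("correlation of correctness:", agreement)
for frac in (0.0, 0.25):
    print("reject", frac, "-> error", error_vs_reject(sys_a, truth, conf_a, frac))
```

In this sketch, a high correlation means the two systems tend to fail on the same samples, and a steep drop in error as the reject fraction grows means the confidence values are informative.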
