Training an on-line handwriting recognizer

Abstract

Character model graphs are created, and their parameters are adjusted to optimize the character recognition performed with them. In effect, the character recognizer that uses the model graphs is trained. The model graphs are created in three stages. First, a vector quantization process is applied to a set of raw samples of handwriting symbols to create a smaller set of generalized reference characters or symbols. Second, a character reference model graph structure is created by merging the generalized-form model graphs of the same character into a single character reference model graph. The merging is based on the weighted Euclidean distance between the parts of the trajectory assigned to graph edges. As the last part of this second stage, “type-similarity” vectors are assigned to model edges to describe the similarity of a given model edge to each shape and to each possible quantized value of the other input graph edge parameters. Thus similarity functions, or similarity values, are defined by different tables on different model edges. In the third stage, model creation further consists of minimizing the recognition error by adjusting the model graph parameters. An appropriate smoothing approximation is used in calculating the similarity score between an input graph and the model graphs. The input graph represents a word from a work sample set used for training, i.e., for adjusting the model graph parameters. A recognition error is calculated as a function of the difference between the similarity scores for the best answers and the score for the one correct answer for the word being recognized. The gradient of the recognition error with respect to the parameters is computed and used to adjust the parameters. Model graphs with the adjusted parameters are then used to recognize the words in a test set, and the percentage of correct recognitions in the test set is calculated. The recognition error calculation with the work set, the parameter adjustment, and the calculation of the percentage of correct recognitions with the test set are repeated. After a number of iterations of this process, the optimum set of parameters for the model graphs will be found.
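The first stage and the merging criterion named in the abstract can be illustrated with a minimal sketch. The abstract does not say which vector quantization procedure is used, so Lloyd-style iteration is an assumption here, as are the fixed-length feature vectors, the initial codebook, and the function names `weighted_distance` and `quantize`, which are hypothetical.

```python
import math


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def quantize(samples, codebook, weights, rounds=10):
    """Lloyd-style vector quantization (an assumed procedure, not the patent's).

    Each raw sample is assigned to its nearest reference vector, and every
    reference vector is then replaced by the mean of its assigned samples.
    """
    codebook = [list(c) for c in codebook]
    for _ in range(rounds):
        clusters = [[] for _ in codebook]
        for s in samples:
            nearest = min(
                range(len(codebook)),
                key=lambda k: weighted_distance(s, codebook[k], weights),
            )
            clusters[nearest].append(s)
        for k, members in enumerate(clusters):
            if members:  # leave a reference unchanged if nothing was assigned
                codebook[k] = [sum(col) / len(members) for col in zip(*members)]
    return codebook
```

With four raw samples and two initial references, `quantize` reduces the samples to two generalized reference vectors; the weights let some coordinates count more than others when deciding which reference is nearest.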
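The third stage described in the abstract (a smoothed score comparison, a recognition error, a gradient step, and a test-set check, all repeated) can be sketched as follows. Everything specific here is an assumption made for illustration: scores are taken to be linear in the parameters, the smoothing is a log-sum-exp approximation to the maximum, the error is a sigmoid of the score difference, and the loop keeps the parameters with the highest test-set percentage. All function names are hypothetical.

```python
import math


def score(params, features):
    """Assumed similarity score: linear in the adjustable parameters."""
    return sum(p * f for p, f in zip(params, features))


def smooth_max(values, beta=5.0):
    """Log-sum-exp approximation to max(values); differentiable everywhere."""
    m = max(values)
    return m + math.log(sum(math.exp(beta * (v - m)) for v in values)) / beta


def softmax_weights(values, beta=5.0):
    """Partial derivatives of smooth_max with respect to each value."""
    m = max(values)
    exps = [math.exp(beta * (v - m)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def error_and_gradient(params, sample, beta=5.0):
    """sample = (correct_features, [competitor_features, ...])."""
    correct, competitors = sample
    comp_scores = [score(params, c) for c in competitors]
    diff = smooth_max(comp_scores, beta) - score(params, correct)
    err = 1.0 / (1.0 + math.exp(-diff))  # near 1 when a wrong answer wins
    w = softmax_weights(comp_scores, beta)
    slope = err * (1.0 - err)  # derivative of the sigmoid
    grad = [
        slope * (sum(wk * c[i] for wk, c in zip(w, competitors)) - correct[i])
        for i in range(len(params))
    ]
    return err, grad


def percent_correct(params, samples):
    """Percentage of samples whose correct answer outscores every competitor."""
    hits = sum(
        1
        for correct, competitors in samples
        if score(params, correct) > max(score(params, c) for c in competitors)
    )
    return 100.0 * hits / len(samples)


def train(params, work_set, test_set, rate=0.5, iterations=50):
    """Repeat: error gradient on the work set, step, evaluate on the test set."""
    params = list(params)
    best_params, best_pct = list(params), percent_correct(params, test_set)
    for _ in range(iterations):
        total = [0.0] * len(params)
        for sample in work_set:
            _, grad = error_and_gradient(params, sample)
            total = [a + b for a, b in zip(total, grad)]
        params = [p - rate * g / len(work_set) for p, g in zip(params, total)]
        pct = percent_correct(params, test_set)
        if pct > best_pct:  # assumed stopping rule: keep the best test result
            best_params, best_pct = list(params), pct
    return best_params, best_pct
```

The smoothing matters because a hard maximum over competing answers is not differentiable where two answers tie; replacing it with `smooth_max` gives a recognition error whose gradient exists for every parameter setting.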

Bibliographic data

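  • Publication number: US7382921B2

    Patent type:

  • Publication date: 2008-06-03

    Original format: PDF

  • Applicant/assignee: ILIA LOSSEV; NATALIA BAGOTSKAYA

    Application number: US20040848650

  • Inventors: ILIA LOSSEV; NATALIA BAGOTSKAYA

    Filing date: 2004-05-18

  • Classification: G06K9/00

  • Country: US

  • Database entry time: 2022-08-21 20:10:10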
