Semi-automatic Training Sets Acquisition for Handwriting Recognition

Abstract

This paper describes a method of semi-automatic training set acquisition for the character classifiers used in cursive handwriting recognition. The training set consists of character samples extracted from a training corpus by segmentation. The method first splits each word image from the corpus into a sequence of graphemes. A set of candidate segmentation variants is then derived with an evolutionary algorithm; each segmentation variant determines how the grapheme sequences of words are subdivided into subsequences corresponding to consecutive letters. Segmentation variants are modeled by a population of chromosomes. Next, each segmentation variant from the final population is tuned in an iterative process, and the best chromosome is selected. The character samples produced by applying the segmentation modeled by the selected chromosome are then grouped into sets corresponding to the letters of the alphabet. Finally, the most outlying samples are rejected so as to maximize the word recognition accuracy obtained with a character classifier trained on the reduced sample set.
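The abstract does not say how a chromosome encodes a segmentation variant. As a minimal sketch under that caveat, one plausible encoding stores, for each word, the sorted cut positions that divide its grapheme sequence into one non-empty subsequence per letter. The names `random_cuts`, `decode`, and `mutate` are hypothetical and are not taken from the paper; fitness evaluation, crossover, and the iterative tuning step are omitted.

```python
import random
from typing import List, Sequence, Tuple


def random_cuts(n_graphemes: int, n_letters: int, rng: random.Random) -> Tuple[int, ...]:
    """Draw n_letters - 1 distinct cut positions so each letter gets >= 1 grapheme.

    Assumed encoding (not from the paper): a cut at position p separates
    grapheme p - 1 from grapheme p.
    """
    if not 1 <= n_letters <= n_graphemes:
        raise ValueError("need at least one grapheme per letter")
    return tuple(sorted(rng.sample(range(1, n_graphemes), n_letters - 1)))


def decode(graphemes: Sequence[str], cuts: Sequence[int]) -> List[List[str]]:
    """Split a grapheme sequence into consecutive per-letter subsequences."""
    bounds = [0, *cuts, len(graphemes)]
    return [list(graphemes[a:b]) for a, b in zip(bounds, bounds[1:])]


def mutate(cuts: Sequence[int], n_graphemes: int, rng: random.Random) -> Tuple[int, ...]:
    """Shift one randomly chosen cut by +/-1; keep the original if the move is invalid."""
    if not cuts:
        return tuple(cuts)
    i = rng.randrange(len(cuts))
    moved = list(cuts)
    moved[i] += rng.choice((-1, 1))
    # The moved cut must stay strictly between its neighbours (or the word ends).
    low = moved[i - 1] + 1 if i > 0 else 1
    high = moved[i + 1] - 1 if i < len(moved) - 1 else n_graphemes - 1
    return tuple(moved) if low <= moved[i] <= high else tuple(cuts)


# A 3-letter word whose image was split into 5 graphemes:
parts = decode(["g0", "g1", "g2", "g3", "g4"], (2, 3))
print(parts)  # [['g0', 'g1'], ['g2'], ['g3', 'g4']]
```

In this sketch a full chromosome would hold one such cut tuple per word in the corpus, and decoding every word yields the character samples that are later grouped by letter.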
