Conference: International Conference on Frontiers in Handwriting Recognition

Segmentation and Stitching Improves Handwriting Recognition on Datasets with Few Samples

Abstract

Through the mid-nineties, handwriting recognition was performed with segmentation-based approaches, which rely on hand-crafted, script-specific segmentation algorithms to isolate characters for subsequent recognition. These systems have since been eclipsed by segmentation-free systems that learn to segment implicitly and to recognize characters in a sequential signal automatically. It may therefore be tempting to consider character segmentation a solved problem. However, we show that for small training datasets, manually or automatically provided segmentation information can guide the learning process in segmentation-free systems, leading to significantly higher generalization accuracy from few training exemplars. We show that providing just a small amount of segmentation information, by isolating parts of a handwriting sample and synthesizing new sequences from them, allows very small annotated datasets to achieve accuracy comparable to that of much larger annotated datasets at only a fraction of the human effort. We validate this technique on the George Washington Papers word database, a similarly sized journal word corpus (the Smith collection), subsets of the well-known IAM handwriting database, and sequences of MNIST handwritten digits.
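
The abstract does not spell out how new sequences are synthesized. As a rough illustration only, and not the authors' implementation, the sketch below assumes that segmentation has already produced a bank of character crops, all of the same height, keyed by their label. The names `stitch_word` and `glyph_bank` and the `gap` and `background` parameters are hypothetical. The function builds a synthetic word image, together with its transcription, by concatenating one randomly chosen crop per character.

```python
import random

import numpy as np


def stitch_word(text, glyph_bank, rng, gap=2, background=0):
    """Stitch one segmented crop per character into a synthetic word image.

    glyph_bank maps a character to a list of 2-D arrays (height x width).
    All crops are assumed to share the same height and dtype.
    Returns the stitched image and its label (the input text).
    """
    pieces = []
    for index, char in enumerate(text):
        crop = rng.choice(glyph_bank[char])  # pick any sample of this character
        pieces.append(crop)
        if index < len(text) - 1:
            # Insert a blank column block between neighbouring characters.
            spacer = np.full((crop.shape[0], gap), background, dtype=crop.dtype)
            pieces.append(spacer)
    return np.hstack(pieces), text


# Toy bank: constant-valued arrays stand in for real segmented crops.
glyph_bank = {
    "a": [np.ones((4, 3), dtype=np.uint8)],
    "b": [np.full((4, 5), 2, dtype=np.uint8)],
}
image, label = stitch_word("ab", glyph_bank, random.Random(0))
```

Calling `stitch_word` repeatedly with different strings and random seeds would yield many labelled sequences from a small set of segmented samples, which is the general idea the abstract describes. A real pipeline would also need to handle baseline alignment, varying crop heights, and cursive joins, none of which this sketch attempts.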
