International Conference on Frontiers in Handwriting Recognition

Crowdsourcing Online Handwriting Acquisition to Develop and Deploy a Unicode Character Classifier

Abstract

There are thousands of Unicode characters, so finding a particular one by eye can be hard. We therefore set out to build a tool that lets a user handwrite a character and returns a list of the candidates most similar to that input. The tool will be integrated into a math editor that handles more than 5,000 different Unicode characters. Because we found no public dataset that fit our needs, we crowdsourced the acquisition of online handwritten data for training. We developed a neural network that combines convolutional layers with shape-based features to classify online handwritten Unicode characters. To make the model more robust to input variability, we applied data augmentation in the form of affine transformations. We achieved a top-20 error rate of 12.64% on validation data and received positive feedback from users, which supports crowdsourcing as a suitable method for online handwriting acquisition. Finally, we deployed the model behind a JSON-based REST API and released a public demo that uses it. In this way, we present the full development cycle of a Unicode character classifier.
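The abstract names two techniques without giving details: affine data augmentation of online (stroke-based) handwriting, and a top-20 error rate. The following is a minimal sketch of both, not the authors' implementation. The function names `random_affine` and `top_k_error`, the parameter ranges, and the representation of a character as a list of strokes of (x, y) points are all assumptions made for illustration.

```python
import math
import random


def random_affine(strokes, max_rotation_deg=10.0, max_scale=0.1,
                  max_shear=0.1, rng=None):
    """Apply one random affine map to every point of a handwritten character.

    `strokes` is a list of strokes; each stroke is a list of (x, y) points.
    The parameter ranges are illustrative assumptions, not values from the paper.
    """
    rng = rng or random.Random()
    theta = math.radians(rng.uniform(-max_rotation_deg, max_rotation_deg))
    sx = 1.0 + rng.uniform(-max_scale, max_scale)
    sy = 1.0 + rng.uniform(-max_scale, max_scale)
    shear = rng.uniform(-max_shear, max_shear)

    # Combined matrix: rotation @ shear @ scale, written out entry by entry.
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    a = cos_t * sx
    b = cos_t * shear * sy - sin_t * sy
    c = sin_t * sx
    d = sin_t * shear * sy + cos_t * sy

    # The same matrix is applied to all strokes so the character stays coherent.
    return [[(a * x + b * y, c * x + d * y) for x, y in stroke]
            for stroke in strokes]


def top_k_error(scores, labels, k=20):
    """Fraction of samples whose true label is not among the k best-scored classes.

    `scores` holds one list of per-class scores per sample;
    `labels` holds the true class index of each sample.
    """
    misses = 0
    for row, label in zip(scores, labels):
        ranked = sorted(range(len(row)), key=row.__getitem__, reverse=True)
        if label not in ranked[:k]:
            misses += 1
    return misses / len(labels)
```

Under these assumptions, each training sample would be passed through `random_affine` with a fresh random draw, and `top_k_error(scores, labels, k=20)` would yield the kind of figure the abstract reports. A top-20 metric suits a tool that shows the user a list of candidates, since a prediction counts as correct whenever the intended character appears anywhere in that list.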