IAPR International Conference on Document Analysis and Recognition

Massive, Free and Reproducible Groundtruthed Document Image Databases Generation with DocCreator

Abstract

Whether your research focuses on image restoration, layout analysis, text-graphic separation, binarization, OCR, or a related task, you need a groundtruthed database to train or evaluate your method. This article presents DocCreator, a multi-platform, open-source software tool that creates large numbers of synthetic document images with controlled ground truth. With DocCreator, you can create fully synthetic images by choosing the text, font, background, and layout; add various realistic degradations (bleed-through, light defects, paper deformation, ink degradation, etc.) to original images; or combine both approaches to enlarge your database. DocCreator is available as an online version (easy to try out) and as a desktop application (faster computation, and no need to upload copyrighted data). DocCreator is useful for retraining tasks and for determining precisely how robust your algorithm is. It has already been used successfully and could help other DIAR (document image analysis and recognition) researchers produce and share groundtruthed databases.
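To make the idea of "controlled ground truth" concrete, here is a minimal Python sketch of synthetic generation followed by a bleed-through degradation. It is not DocCreator's algorithm or API. The function names (make_page, ground_truth, add_bleed_through), the page size, and the simple min-blend model are all assumptions chosen for illustration. The point it shows is that when the clean page is generated by the program, the ink mask is known exactly, and a fixed random seed makes the resulting database reproducible.

```python
import random

WHITE, INK = 255, 0  # 8-bit grayscale values (assumed convention)


def make_page(width, height, n_strokes, rng):
    """Return a synthetic page (list of rows) with short horizontal ink strokes."""
    page = [[WHITE] * width for _ in range(height)]
    for _ in range(n_strokes):
        y = rng.randrange(height)
        x0 = rng.randrange(width - 4)
        for x in range(x0, x0 + 4):
            page[y][x] = INK
    return page


def ground_truth(page):
    """Binary mask of the clean page: 1 where there is ink, 0 elsewhere."""
    return [[1 if p == INK else 0 for p in row] for row in page]


def add_bleed_through(recto, verso, strength):
    """Hypothetical bleed-through model: darken the recto with a faded,
    horizontally mirrored copy of the verso. strength is in [0, 1]."""
    out = []
    for r_row, v_row in zip(recto, verso):
        mirrored = v_row[::-1]  # the verso appears mirrored through the paper
        out.append([
            min(r, round(WHITE - strength * (WHITE - v)))
            for r, v in zip(r_row, mirrored)
        ])
    return out


rng = random.Random(0)  # fixed seed: the same pages are produced on every run
recto = make_page(32, 16, 10, rng)
verso = make_page(32, 16, 10, rng)
mask = ground_truth(recto)  # known exactly, because we generated the page
degraded = add_bleed_through(recto, verso, strength=0.4)
```

The pair (degraded, mask) is one groundtruthed sample. Varying the seed and the strength parameter would yield a family of images of graded difficulty for testing how robust a binarization or OCR method is.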
