Conference: ICMLC (International Conference on Machine Learning and Cybernetics)

Intelligent document processing system for conference article

Abstract

Conventional document processing systems comprise document analysis (DA), document classification, and document understanding. These stages run sequentially, so improper results in one stage lead to improper results in the next. Furthermore, the binarization methods used in DA to threshold an A4-sized color image are inefficient because they scan the entire image at least once. The block segmentation methods used in DA to segment an A4-sized binary image are inefficient because they scan the entire image at least twice. The layout analysis methods in DA are also inefficient: they use global and local analysis and scan the entire image at least once. This article proposes an intelligent, efficient, and effective document processing system to solve these problems. The proposed method consists of document binarization and mixed-based layout analysis. The binarization method scans only the border image. The mixed-based layout analysis combines block segmentation and block classification. The block segmentation scans only the background image. The block classification uses background gaps and writing format to classify blocks. Experimental results show that the proposed method performs better than FineReader 11.0 in visual measurement.
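
The abstract says only that the binarization step scans the border image; it does not give the algorithm. As one possible reading, the minimal Python sketch below estimates a global threshold from a frame of border pixels, on the assumption that the page border is mostly background. The function names, the border width, and the mean-minus-k-standard-deviations rule are all illustrative assumptions, not the authors' method.

```python
import numpy as np


def border_pixels(gray: np.ndarray, width: int) -> np.ndarray:
    """Return the pixels in a frame of `width` pixels around the image (1-D)."""
    h, w = gray.shape
    if width <= 0 or 2 * width >= min(h, w):
        raise ValueError("width must be positive and less than half the image size")
    top = gray[:width, :].ravel()
    bottom = gray[-width:, :].ravel()
    # Exclude the corners already covered by the top and bottom strips.
    left = gray[width:-width, :width].ravel()
    right = gray[width:-width, -width:].ravel()
    return np.concatenate([top, bottom, left, right])


def threshold_from_border(gray: np.ndarray, width: int = 8, k: float = 3.0) -> float:
    """Estimate a global threshold by reading only the border frame.

    Assumed rule (hypothetical): background mean minus k standard deviations.
    """
    border = border_pixels(gray, width).astype(np.float64)
    return float(border.mean() - k * border.std())


def binarize(gray: np.ndarray, threshold: float) -> np.ndarray:
    """Return a boolean mask where True marks pixels darker than the threshold."""
    return gray < threshold
```

In this sketch only the threshold estimate is restricted to the border frame; applying the threshold in `binarize` still reads each pixel once, so it illustrates the idea of sampling the border and does not reproduce the efficiency claimed in the abstract.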
