International Conference on Document Analysis and Recognition
Two Stream Deep Network for Document Image Classification

Abstract

This paper presents a novel two-stream approach for document image classification. The proposed approach leverages textual and visual modalities to classify document images into ten categories, including letter, memo, and news article. To reduce the dependency of the textual stream on the performance of the underlying OCR (a limitation of general content-based document image classifiers), we use a filter-based feature-ranking algorithm. This algorithm ranks the features of each class by their ability to discriminate document images and retains the top 'K' features for further processing. In parallel, the visual stream uses deep CNN models to extract structural features of document images. Finally, the textual and visual streams are combined using an average ensembling method. Experimental results show that the proposed approach outperforms the state-of-the-art system by a significant margin of 4.5% on the publicly available Tobacco-3482 dataset.
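
The two ideas named in the abstract, per-class top-'K' feature selection and average ensembling of the two streams, can be illustrated with a minimal Python sketch. The abstract does not state which ranking metric the authors use, so the score below (in-class document frequency minus out-of-class document frequency) is an assumption chosen only for illustration. The function names `top_k_terms_per_class` and `average_ensemble` are hypothetical and do not come from the paper.

```python
def top_k_terms_per_class(docs, labels, k):
    """Rank the terms of each class with a simple filter score; keep the top k.

    docs   : list of token lists (e.g. OCR output split into words)
    labels : class label of each document
    Score (assumed, not the paper's metric): fraction of in-class documents
    containing the term minus the fraction of other documents containing it.
    """
    selected = {}
    for c in sorted(set(labels)):
        in_docs = [set(d) for d, y in zip(docs, labels) if y == c]
        out_docs = [set(d) for d, y in zip(docs, labels) if y != c]
        vocab = set().union(*in_docs)

        def score(term):
            p_in = sum(term in d for d in in_docs) / len(in_docs)
            p_out = (sum(term in d for d in out_docs) / len(out_docs)
                     if out_docs else 0.0)
            return p_in - p_out

        # Highest score first; ties are broken alphabetically for determinism.
        ranked = sorted(vocab, key=lambda t: (-score(t), t))
        selected[c] = ranked[:k]
    return selected


def average_ensemble(text_probs, visual_probs):
    """Average two per-class probability vectors.

    Returns (index of the winning class, averaged probabilities).
    """
    if len(text_probs) != len(visual_probs):
        raise ValueError("both streams must score the same set of classes")
    avg = [(t + v) / 2 for t, v in zip(text_probs, visual_probs)]
    best = max(range(len(avg)), key=avg.__getitem__)
    return best, avg


if __name__ == "__main__":
    docs = [["dear", "sir", "regards"], ["dear", "madam", "regards"],
            ["memo", "to", "from"], ["memo", "from", "subject"]]
    labels = ["letter", "letter", "memo", "memo"]
    print(top_k_terms_per_class(docs, labels, k=2))
    print(average_ensemble([0.75, 0.25], [0.0, 1.0]))
```

In this toy run the textual stream alone would pick class 0, while the averaged scores pick class 1, which shows how the visual stream can override a weak textual prediction. A filter method of this kind scores each feature independently of any classifier, which is what distinguishes it from wrapper or embedded feature selection.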