International Workshop on Graphics Recognition

Visual Saliency and Terminology Extraction for Document Classification


Abstract

Document digitization has become a crucial economic issue in our society, which makes it necessary to organize the resulting huge volume of documents. This paper proposes a new method to automatically classify documents, using a saliency-based segmentation process on the one hand and terminology extraction and annotation on the other. The saliency-based segmentation extracts salient regions, and thereby logos, while the terminology approach annotates them and automatically classifies the document. The approach requires no human expertise and uses Google Images as a knowledge database. Results obtained on a real database of 1766 documents show the relevance of the approach.
