Source: 2011 International Conference on Document Analysis and Recognition

Document Images Indexing with Relevance Feedback: An Application to Industrial Context



Abstract

This article presents a new method for indexing document images. The work is set in an industrial context where thousands of document images are digitized daily and must be sorted into classes such as payroll documents, various bills, and information letters. We propose a software method that aims to accelerate this task. The number of document classes is usually unknown a priori, so we propose an automatic estimation of this number. Given the estimated number of classes, we use a clustering algorithm to group the document images. After this step, we propose an assisted classification tool based on content-based image retrieval (CBIR). For each cluster, a reference image is selected automatically; the other images are then sorted according to a similarity measure and shown to the user. By interacting with the process, the user can reject wrong images. This feedback is automatically taken into account to improve the similarity measure by weighting each feature. Initial tests show that, on average, databases are indexed three times faster with our assisted classification method than with a standard manual classification process.
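The feedback loop described in the abstract (rank images against a reference, let the user reject wrong ones, then reweight the features) can be illustrated with a minimal sketch. The abstract does not specify the features, the similarity measure, or the weighting rule, so everything below is an assumption made for illustration: feature vectors are plain lists of floats, similarity is a weighted Euclidean distance, and the weights follow an inverse-standard-deviation heuristic that is common in the relevance feedback literature. The function names are hypothetical and are not taken from the paper.

```python
import math
from statistics import pstdev


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two feature vectors (assumed measure)."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def rank_by_similarity(reference, candidates, weights):
    """Return candidate indices sorted from most to least similar to the reference."""
    return sorted(
        range(len(candidates)),
        key=lambda i: weighted_distance(reference, candidates[i], weights),
    )


def update_weights(accepted, eps=1e-6):
    """Reweight features from the images the user did not reject.

    Assumption (not from the paper): a feature that varies little across
    the accepted images is treated as discriminative for the class, so it
    receives weight 1 / (std + eps). The weights are then normalized so
    that they sum to the number of features.
    """
    n_features = len(accepted[0])
    raw = [
        1.0 / (pstdev([img[f] for img in accepted]) + eps)
        for f in range(n_features)
    ]
    total = sum(raw)
    return [n_features * r / total for r in raw]


if __name__ == "__main__":
    # Toy data: feature 0 separates the classes, feature 1 is noise.
    reference = [0.0, 0.5]
    candidates = [
        [0.05, 0.9],   # same class as the reference
        [0.30, 0.5],   # wrong class
        [0.02, 0.1],   # same class as the reference
        [0.35, 0.45],  # wrong class
    ]

    weights = [1.0, 1.0]
    print("initial ranking:", rank_by_similarity(reference, candidates, weights))

    # Simulated feedback: the user rejects candidates 1 and 3.
    accepted = [reference, candidates[0], candidates[2]]
    weights = update_weights(accepted)
    print("updated weights:", weights)
    print("ranking after feedback:", rank_by_similarity(reference, candidates, weights))
```

With this toy data the noisy second feature dominates under uniform weights, so the two wrong-class images are ranked first. After the simulated rejection, the first feature receives most of the weight and the wrong-class images move to the end of the ranking. A real system would replace the hand-made vectors with features extracted from the document images.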
