Conference: International Conference on Advanced Technologies for Signal and Image Processing

Document class recognition using a support vector machine approach

Abstract

In most document archiving systems, a central task is identifying the category of each document. In most cases, determining the document category requires a classification model, an approach that has been successful in improving document processing. Moreover, the rapidly growing use of documents in office environments has driven increasing interest in Document Image Analysis (DIA) systems. An automated tool that can distinguish photographs, textual documents, and mixed documents in a heterogeneous dataset can reduce the complexity of the archiving process. Instead of applying the same archiving strategy to every document in the dataset, a machine vision system can select the archiving process appropriate for each input document. This paper examines the use of the support vector machine (SVM) algorithm for classifying documents in a heterogeneous digital document dataset. Our purpose is to investigate whether a sufficient classification rate can be achieved when an SVM is employed as the classification model in an automated document archiving system. In our experiments, a mixture of low-level features characterizing the documents in the dataset was tested to build this classification model. The results show that the proposed SVM model achieves 96% accuracy on a set of 250 test documents.
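
The abstract does not specify the low-level features, the SVM kernel, or the training setup, so the following is only a minimal sketch of the general idea: a three-class (text, photo, mixed) classifier built from one-vs-rest linear SVMs. Everything concrete in it is an assumption made for illustration: the two feature names (edge density, colour variance), the synthetic cluster centres, the hyperparameters, and the choice of stochastic subgradient descent on the hinge loss as the training method. It is not the authors' model and does not reproduce their reported result.

```python
import random

CLASSES = ["text", "photo", "mixed"]
# Hypothetical cluster centres for two made-up low-level features
# (edge density, colour variance); these are NOT taken from the paper.
CENTRES = {"text": (0.9, 0.1), "photo": (0.1, 0.9), "mixed": (0.9, 0.9)}


def make_dataset(n_per_class, seed):
    """Draw synthetic (features, label) pairs around each class centre."""
    rng = random.Random(seed)
    data = [([rng.gauss(cx, 0.05), rng.gauss(cy, 0.05)], label)
            for label, (cx, cy) in CENTRES.items()
            for _ in range(n_per_class)]
    rng.shuffle(data)
    return data


def score(w, b, x):
    """Signed decision value w.x + b of a linear classifier."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_binary_svm(X, y, lam=0.001, lr=0.1, epochs=200):
    """Linear SVM trained by stochastic subgradient descent on the
    L2-regularised hinge loss; labels must be +1 or -1."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            violated = yi * score(w, b, xi) < 1      # inside the margin?
            w = [wj - lr * lam * wj for wj in w]     # regularisation step
            if violated:                             # hinge-loss step
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def train_one_vs_rest(data):
    """Train one binary SVM per class (that class vs. all others)."""
    X = [x for x, _ in data]
    return {c: train_binary_svm(X, [1 if label == c else -1 for _, label in data])
            for c in CLASSES}


def predict(models, x):
    """Return the class whose binary SVM gives the highest score."""
    return max(CLASSES, key=lambda c: score(*models[c], x))


train = make_dataset(30, seed=0)
test = make_dataset(20, seed=1)
models = train_one_vs_rest(train)
accuracy = sum(predict(models, x) == label for x, label in test) / len(test)
print(f"accuracy on synthetic test set: {accuracy:.2f}")
```

The accuracy printed here reflects only how well separated the synthetic clusters are. In a real archiving pipeline the features would be extracted from document images, and a kernel SVM from an established library would likely be a more practical choice than this hand-written linear trainer.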
