SYSTEMS AND METHODS TO AUTOMATICALLY CLASSIFY ELECTRONIC DOCUMENTS USING EXTRACTED IMAGE AND TEXT FEATURES AND USING A MACHINE LEARNING SUBSYSTEM

Abstract

A document analysis system automatically classifies documents by recognizing distinctive features in each document. It comprises a document acquisition system, a document recognition training system, a document classification system, a document feature recognition system, and a job organization system. The document acquisition system receives jobs, wherein each job contains at least one electronic document. The document feature recognition system automatically extracts image and text features from each received document. The document classification system automatically classifies recognized electronic documents by finding the best match between the extracted features of each document and the feature sets associated with each category of document. The document recognition training system automatically trains the feature set for each corresponding category of documents, wherein the training system automatically modifies the feature set for a document category using the extracted features of unrecognized documents. The job organization system automatically organizes each job according to the document categories it contains.
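
The classification, training, and job organization steps described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: features are modeled as plain string tags, "best match" is approximated with Jaccard similarity, and the function names (`classify`, `train`, `organize`), the 0.3 threshold, and the sample categories are hypothetical.

```python
from collections import defaultdict


def jaccard(a, b):
    """Overlap between two feature sets, in the range [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def classify(features, category_features, threshold=0.3):
    """Return the best-matching category, or None if nothing clears the threshold.

    Jaccard similarity and the threshold value are illustrative assumptions.
    """
    best_category, best_score = None, 0.0
    for category, feature_set in category_features.items():
        score = jaccard(features, feature_set)
        if score > best_score:
            best_category, best_score = category, score
    return best_category if best_score >= threshold else None


def train(category_features, category, features):
    """Fold an unrecognized document's features into a category's feature set."""
    category_features.setdefault(category, set()).update(features)


def organize(job, category_features):
    """Group the documents of one job by their predicted category."""
    groups = defaultdict(list)
    for name, features in job.items():
        groups[classify(features, category_features)].append(name)
    return dict(groups)


# Hypothetical feature sets; a real system would extract these from images and text.
category_features = {
    "invoice": {"text:invoice", "text:total", "image:logo_top_left"},
    "tax_form": {"text:w-2", "text:wages", "image:boxed_grid"},
}
job = {
    "doc1": {"text:invoice", "text:total", "text:due"},
    "doc2": {"text:wages", "text:w-2", "image:boxed_grid"},
    "doc3": {"text:statement", "text:balance", "image:bank_logo"},
}

before = organize(job, category_features)  # doc3 is grouped under None (unrecognized)
train(category_features, "bank_statement", job["doc3"])
after = organize(job, category_features)   # doc3 is now grouped under "bank_statement"
```

In this sketch an unrecognized document is grouped under `None` until its features are used to extend a category's feature set, which mirrors the role the abstract gives to the training system. How the patented system actually scores matches or updates feature sets is not stated in the abstract.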