METHOD AND SYSTEM FOR EXTRACTING DATA FROM IMAGES OF SEMISTRUCTURED DOCUMENTS

Abstract

FIELD: physics.

SUBSTANCE: to extract data from the fields of a document image, a text representation of the image is obtained. A graph is constructed that stores the attributes of the document's text fragments and the links between them. Cascade classification is performed to compute those attributes and links. A set of hypotheses is formed about which fields on the document image the text fragments belong to. A combination of hypotheses is selected, and data is extracted from the fields on the document image based on the selected combination.

EFFECT: saving of computing resources.

15 claims, 8 drawings.
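
The steps in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the patent: the `Fragment` type, the regex-based attributes, the "nearest fragment to the left" link rule, the 0.5/0.5 scoring, and the brute-force search over hypothesis combinations are all hypothetical stand-ins. The only idea carried over is the ordering of the steps, with a cheap classification stage run on every fragment and a second stage run only on the fragments that passed the first, which is one plausible reading of how a cascade saves computation.

```python
import re
from dataclasses import dataclass, field
from itertools import product


@dataclass
class Fragment:
    """A recognized text fragment with its position on the page (hypothetical type)."""
    text: str
    x: float
    y: float
    attrs: dict = field(default_factory=dict)


def build_graph(fragments):
    """Link each fragment to the nearest fragment to its left on the same line."""
    edges = {}
    for i, f in enumerate(fragments):
        left = [(f.x - g.x, j) for j, g in enumerate(fragments)
                if j != i and abs(g.y - f.y) < 5 and g.x < f.x]
        if left:
            edges[i] = min(left)[1]
    return edges


def cascade_classify(fragments, edges):
    """Stage 1 runs on every fragment; stage 2 only on stage-1 candidates."""
    for f in fragments:
        f.attrs["is_date"] = bool(re.fullmatch(r"\d{2}\.\d{2}\.\d{4}", f.text))
        f.attrs["is_amount"] = bool(re.fullmatch(r"\d+\.\d{2}", f.text))
        f.attrs["label"] = None
    for i, f in enumerate(fragments):
        is_candidate = f.attrs["is_date"] or f.attrs["is_amount"]
        if is_candidate and i in edges:
            # Use the linked neighbor's text as a label, e.g. "Total:" -> "total".
            f.attrs["label"] = fragments[edges[i]].text.lower().rstrip(":")


def form_hypotheses(fragments, fields):
    """For each field, list (fragment index, score) candidates; None means 'absent'."""
    hypotheses = {}
    for name, (attr, keyword) in fields.items():
        candidates = [(None, 0.0)]
        for i, f in enumerate(fragments):
            if f.attrs[attr]:
                score = 0.5 + (0.5 if f.attrs["label"] == keyword else 0.0)
                candidates.append((i, score))
        hypotheses[name] = candidates
    return hypotheses


def select_combination(hypotheses):
    """Pick the highest-scoring combination that uses no fragment twice."""
    names = list(hypotheses)
    best, best_score = None, -1.0
    for combo in product(*(hypotheses[n] for n in names)):
        used = [i for i, _ in combo if i is not None]
        if len(used) != len(set(used)):
            continue
        score = sum(s for _, s in combo)
        if score > best_score:
            best, best_score = combo, score
    return {n: i for n, (i, _) in zip(names, best)}


def extract(fragments, fields):
    """Run the full sketch: graph, cascade, hypotheses, selection, extraction."""
    edges = build_graph(fragments)
    cascade_classify(fragments, edges)
    chosen = select_combination(form_hypotheses(fragments, fields))
    return {n: (fragments[i].text if i is not None else None)
            for n, i in chosen.items()}


# Made-up sample data: three label/value pairs laid out on separate lines.
SAMPLE = [
    Fragment("Date:", 10, 10), Fragment("12.03.2017", 60, 10),
    Fragment("Total:", 10, 30), Fragment("150.00", 60, 30),
    Fragment("Tax:", 10, 50), Fragment("15.00", 60, 50),
]
# field name -> (stage-1 attribute required, expected label keyword)
FIELDS = {"date": ("is_date", "date"), "total": ("is_amount", "total")}
```

Calling `extract(SAMPLE, FIELDS)` assigns each field the fragment whose format and neighboring label both match. The exhaustive search in `select_combination` is only practical for a handful of fields; a real system would need a more economical selection strategy, which the abstract does not describe.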
