DEEP DOCUMENT PROCESSING WITH SELF-SUPERVISED LEARNING

Abstract

A document processing system processes documents that include typewritten and/or handwritten data by converting them to document images for entity extraction. A received document is first processed to generate a deep document data structure and to classify the document as either structured or unstructured. If the document is classified as structured, entities are extracted based on a matching template and on image alignment of the document image with that template. If the document is classified as unstructured, entities are extracted by obtaining nodes and providing those nodes to a self-supervised masked visual language model.
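The two-branch routing described in the abstract can be illustrated with a minimal Python sketch. Every name here (`DocumentImage`, `Template`, `layout_signature`, `classify`, `process`, `toy_model`) is hypothetical and does not come from the patent. The classifier is assumed to be a simple template lookup, image alignment is assumed to be the identity, the deep document data structure is omitted, and `toy_model` is a trivial stand-in rather than a masked visual language model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class DocumentImage:
    """Hypothetical stand-in for a document image and its recognized tokens."""
    tokens: List[str]
    layout_signature: str = ""  # assumed fingerprint used to look up a template


@dataclass
class Template:
    """Hypothetical template: entity name -> token index after alignment."""
    fields: Dict[str, int]


Node = Dict[str, object]
Model = Callable[[List[Node]], Dict[str, str]]


def classify(doc: DocumentImage, templates: Dict[str, Template]) -> str:
    # Assumption: a document counts as structured when a template matches it.
    return "structured" if doc.layout_signature in templates else "unstructured"


def extract_structured(doc: DocumentImage, template: Template) -> Dict[str, str]:
    # Assumption: alignment is the identity, so template indices apply directly.
    return {
        name: doc.tokens[index]
        for name, index in template.fields.items()
        if index < len(doc.tokens)
    }


def extract_unstructured(doc: DocumentImage, model: Model) -> Dict[str, str]:
    # Build one node per token and hand the nodes to the supplied model.
    nodes: List[Node] = [
        {"text": token, "position": i} for i, token in enumerate(doc.tokens)
    ]
    return model(nodes)


def process(doc: DocumentImage, templates: Dict[str, Template], model: Model) -> Dict[str, str]:
    # Route the document to the branch chosen by the classifier.
    if classify(doc, templates) == "structured":
        return extract_structured(doc, templates[doc.layout_signature])
    return extract_unstructured(doc, model)


def toy_model(nodes: List[Node]) -> Dict[str, str]:
    # Trivial placeholder for the language model: the first token with "@" is an email.
    for node in nodes:
        text = str(node["text"])
        if "@" in text:
            return {"email": text}
    return {}
```

In this sketch a document whose `layout_signature` matches a known template goes through the template branch, and any other document falls through to the model branch. A real system would replace both the lookup and `toy_model` with learned components.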