UTILIZING MACHINE LEARNING MODELS, POSITION-BASED EXTRACTION, AND AUTOMATED DATA LABELING TO PROCESS IMAGE-BASED DOCUMENTS

Abstract

A device may receive image data that includes an image of a document and lexicon data identifying a lexicon, and may perform an extraction technique on the image data to identify at least one field in the document. The device may utilize form segmentation to automatically generate label data identifying labels for the image data, and may process the image data, the label data, and data identifying the at least one field, with a first model, to identify visual features. The device may process the image data and the visual features, with a second model, to identify sequences of characters, and may process the image data and the sequences of characters, with a third model, to identify strings of characters. The device may compare the lexicon data and the strings of characters to generate verified strings of characters that may be utilized to generate a digitized document.
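
The final step described above, comparing recognized strings against a lexicon to produce verified strings, can be illustrated with a minimal sketch. This is not the method claimed in the patent: the function name `verify_strings`, the similarity cutoff, and the use of Python's standard `difflib` fuzzy matching are all assumptions made for illustration.

```python
from difflib import get_close_matches


def verify_strings(strings, lexicon, cutoff=0.75):
    """Return, for each recognized string, the closest lexicon entry.

    Hypothetical helper: a string with no lexicon entry at or above the
    similarity cutoff is reported as None (unverified).
    """
    verified = []
    for s in strings:
        # get_close_matches ranks candidates by SequenceMatcher ratio
        matches = get_close_matches(s, lexicon, n=1, cutoff=cutoff)
        verified.append(matches[0] if matches else None)
    return verified


if __name__ == "__main__":
    # Strings as an OCR-like stage might emit them, with typical confusions
    recognized = ["lnvoice", "Tota1", "zzzz"]
    lexicon = ["Invoice", "Total", "Date"]
    print(verify_strings(recognized, lexicon))
```

In this sketch, near misses such as "lnvoice" are mapped to the lexicon entry "Invoice", while a string with no sufficiently similar entry is left unverified. A production system would likely tune the cutoff per field and handle ties explicitly.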

Bibliographic details

Publication number: EP3882814A1

Patent type:
Publication date: 2021-09-22

Original format: PDF
Applicant/Assignee: ACCENTURE GLOBAL SOLUTIONS LIMITED

Application number: EP20200207951
Inventors: TANNIRU RAJENDRA PRASAD; KULKARNI ADITI; VIJAYARAGHAVAN KOUSHIK M.; HIGGINS LUKE; SUN XIWEN; GREEN RILEY; CHING MAN LOK; CHEN JIAYI; LIU XIAOLEI; MOORE ISABELLA PHOEBE GROENEWEGEN; LEMA REUBEN

Filing date: 2020-11-17
Classification: G06K9; G06K9/46; G06K9/62; G06K9/72; G06N3/04
Country: EP
Added to database: 2022-08-24 21:10:55
