Machine learning of document templates for data extraction
Abstract
The present system learns prototypical descriptions of data elements so that they can be extracted from machine-readable documents. Document templates are created from sets of training documents and can then be used to extract data from form documents, such as fill-in forms used for taxes; flex-form documents having many variants, such as bills of lading or insurance notifications; and some context-form documents in which a description or graphic indicator appears in proximity to a data element. Given a set of training documents, the system performs an inductive reasoning process to generalize a document template so that the locations of data elements can be predicted for the training examples. The automatically generated document template can then be used to extract data elements from a wide variety of form documents.
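The idea of generalizing a template from training documents can be illustrated with a small sketch. It is not the method claimed in the patent, only one simple way to realize the description above. It assumes that each document is a list of words with page coordinates and that each training document labels the word holding a field's value. The names `Word`, `learn_template` and `extract` are hypothetical, and the "inductive" step is reduced to choosing, for each field, the anchor word whose offset to the labelled value is most consistent across the training set.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Word:
    """A word on a page with the (x, y) position of its bounding box (assumed layout)."""
    text: str
    x: float
    y: float


def learn_template(training_docs):
    """Generalize a template from a non-empty list of (words, labels) pairs.

    `labels` maps a field name to the Word holding its value. For each field
    the template stores (anchor_text, dx, dy): the anchor word whose offset
    to the value varies least across the training documents.
    """
    template = {}
    fields = set.intersection(*(set(labels) for _, labels in training_docs))
    for field in sorted(fields):
        # Candidate anchors: texts present in every training document,
        # not counting the labelled value itself.
        common = set.intersection(*(
            {w.text for w in words if w != labels[field]}
            for words, labels in training_docs
        ))
        scored = []
        for anchor in sorted(common):
            offsets = []
            for words, labels in training_docs:
                value = labels[field]
                a = next(w for w in words if w.text == anchor and w != value)
                offsets.append((value.x - a.x, value.y - a.y))
            dx = mean(o[0] for o in offsets)
            dy = mean(o[1] for o in offsets)
            # Spread: worst deviation of any training offset from the mean.
            spread = max(abs(o[0] - dx) + abs(o[1] - dy) for o in offsets)
            scored.append((spread, abs(dx) + abs(dy), anchor, dx, dy))
        if scored:
            _, _, anchor, dx, dy = min(scored)
            template[field] = (anchor, dx, dy)
    return template


def extract(template, words):
    """Predict each field's location from its anchor and return the nearest word."""
    result = {}
    for field, (anchor, dx, dy) in template.items():
        a = next((w for w in words if w.text == anchor), None)
        candidates = [w for w in words if w is not a]
        if a is None or not candidates:
            continue  # anchor missing: this sketch simply skips the field
        px, py = a.x + dx, a.y + dy
        nearest = min(candidates, key=lambda w: (w.x - px) ** 2 + (w.y - py) ** 2)
        result[field] = nearest.text
    return result
```

With two training invoices in which the "Total:" label sits at different heights but always 60 units to the left of the amount, `learn_template` selects "Total:" as the anchor, and `extract` then finds the amount in a third invoice with yet another layout. A realistic system would need to handle repeated anchor texts, multi-word values and OCR noise, none of which this sketch addresses.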