INFORMATION EXTRACTION FROM OPEN-ENDED SCHEMA-LESS TABLES
Abstract
Systems and methods for generating and annotating cell documents extract tables from a document using a table extraction engine. A header detection engine extracts the headers of each table, and a cell extraction engine extracts the cells of each table. A cell document is generated for each cell; it correlates the cell with the corresponding portions of the headers and records that correlation. Each cell document is then annotated by a cell recognition model trained to perform natural language processing on cell documents: the model classifies each term in a cell document and extracts the relationships between those terms.
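The cell-document step can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `CellDocument` and `build_cell_documents` are hypothetical, the table is assumed to be already extracted as a list of rows, and header detection is reduced to the assumption that the first row holds column headers and the first column holds row headers. The annotation step with a trained cell recognition model is left out.

```python
from dataclasses import dataclass


@dataclass
class CellDocument:
    """One table cell plus the header text it is correlated with (hypothetical type)."""
    value: str
    column_header: str
    row_header: str

    def text(self) -> str:
        # Flatten the cell and its headers into a short passage
        # that a downstream NLP model could annotate.
        return f"{self.row_header} | {self.column_header}: {self.value}"


def build_cell_documents(table: list[list[str]]) -> list[CellDocument]:
    """Build one cell document per data cell.

    Assumption: row 0 holds column headers and column 0 holds row headers.
    """
    column_headers = table[0]
    documents = []
    for row in table[1:]:
        row_header = row[0]
        # Pair each data cell with the column header above it.
        for column_header, value in zip(column_headers[1:], row[1:]):
            documents.append(CellDocument(value, column_header, row_header))
    return documents


# Illustrative data only.
table = [
    ["Item", "2021", "2022"],
    ["Revenue", "10", "12"],
    ["Cost", "4", "5"],
]
docs = build_cell_documents(table)
for doc in docs:
    print(doc.text())
```

Keeping the header text inside each cell document means a model sees the cell's context without needing the table's layout, which is the point of recording the correlation.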