TEXT BASED SCHEMA DISCOVERY AND INFORMATION EXTRACTION
Abstract
Various technologies and techniques are disclosed for text based schema discovery and information extraction. Documents are analyzed to identify sections of the documents and a relationship between the sections. Statistics are stored regarding occurrences of items in the documents. A probabilistic model is generated based on the stored statistics. A database schema is generated with a plurality of tables based upon the probabilistic model. The documents are analyzed against the probabilistic model to determine how the documents map to the tables generated from the database schema. The tables are populated from the documents based on a result of the analysis against the probabilistic model.
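The steps in the abstract can be illustrated with a minimal sketch. It is not the patented method: the toy document format (a line ending in ":" opens a section, "key = value" lines are items), the relative-frequency model, the 0.6 threshold, and every function name below are assumptions made only for illustration.

```python
from collections import Counter, defaultdict


def parse_sections(doc):
    """Split a document into {section: {item: value}} (assumed toy format)."""
    sections, current = {}, None
    for line in doc.splitlines():
        line = line.strip()
        if line.endswith(":"):
            current = line[:-1]
            sections[current] = {}
        elif "=" in line and current is not None:
            key, value = (part.strip() for part in line.split("=", 1))
            sections[current][key] = value
    return sections


def collect_statistics(docs):
    """Count how often each section and each item within a section occurs."""
    section_counts, item_counts = Counter(), defaultdict(Counter)
    for doc in docs:
        for heading, items in parse_sections(doc).items():
            section_counts[heading] += 1
            item_counts[heading].update(items.keys())
    return section_counts, item_counts


def build_model(section_counts, item_counts):
    """Estimate P(item | section) as a relative frequency (an assumed model)."""
    return {
        heading: {item: n / section_counts[heading] for item, n in counts.items()}
        for heading, counts in item_counts.items()
    }


def derive_schema(model, threshold=0.6):
    """One table per section; keep items whose probability meets the threshold."""
    return {
        heading: sorted(item for item, p in probs.items() if p >= threshold)
        for heading, probs in model.items()
    }


def best_table(items, model):
    """Map a section to the table whose model best explains its items."""
    def score(heading):
        return sum(model[heading].get(item, 0.0) for item in items)
    heading = max(model, key=score)
    return heading if score(heading) > 0 else None


def populate(docs, model, schema):
    """Fill the tables by analysing each document against the model."""
    tables = {name: [] for name in schema}
    for doc_id, doc in enumerate(docs):
        for items in parse_sections(doc).values():
            table = best_table(items, model)
            if table is None or not schema[table]:
                continue
            row = {"doc_id": doc_id}
            row.update({column: items.get(column) for column in schema[table]})
            tables[table].append(row)
    return tables


# Hypothetical sample documents, invented for this sketch.
docs = [
    "Author:\nname = Ada\nemail = ada@example.org\nPaper:\ntitle = Notes\nyear = 1843",
    "Author:\nname = Alan\nPaper:\ntitle = Computing\nyear = 1950\npages = 28",
]
model = build_model(*collect_statistics(docs))
schema = derive_schema(model)
tables = populate(docs, model, schema)
print(schema)
print(tables)
```

In this sketch, items that appear in only half of the sections ("email", "pages") fall below the threshold and are left out of the generated tables. Sections are assigned to tables by scoring their items against the model rather than by matching headings, which loosely mirrors the abstract's step of analysing documents against the probabilistic model.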