AUTOMATED DATA EXTRACTION SYSTEM BASED ON HISTORICAL OR RELATED DATA
Abstract
A system and method for extracting data from structured documents using historical or related data. Structured documents are searched for instances of an attribute value that match a known historical value for the attribute. Document features associated with the attribute value are identified and used as anchors marking the location within the hierarchy of the document structure where the attribute value can be found and extracted. The accuracy of each identified anchor is determined by evaluating how well the anchor's extraction history matches the reported history. Anchors are grouped into anchor sets such that all anchors in a set extract attributes from the same structured document template. The anchors are prioritized according to the determined accuracy, and the prioritized list defines the order in which a structured document template is searched for an attribute value.
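The anchor idea in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes a structured document is already parsed into nested Python dicts and lists, treats an "anchor" as the path of keys and indices leading to a value, and scores accuracy as the fraction of historical documents in which the anchor extracts the known value. The names `find_paths`, `extract`, and `rank_anchors` are hypothetical, and the grouping of anchors into per-template anchor sets is left out.

```python
from typing import Any, Iterator, List, Tuple

Path = Tuple[Any, ...]  # assumed anchor form: dict keys / list indices


def find_paths(node: Any, target: Any, prefix: Path = ()) -> Iterator[Path]:
    """Yield every path in a nested dict/list whose leaf equals target."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from find_paths(child, target, prefix + (key,))
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from find_paths(child, target, prefix + (index,))
    elif node == target:
        yield prefix


def extract(doc: Any, path: Path) -> Any:
    """Follow path into doc; return None if any step is missing."""
    node = doc
    for step in path:
        if isinstance(node, dict) and step in node:
            node = node[step]
        elif (isinstance(node, list) and isinstance(step, int)
              and 0 <= step < len(node)):
            node = node[step]
        else:
            return None
    return node


def rank_anchors(history: List[Tuple[Any, Any]]) -> List[Tuple[float, Path]]:
    """Rank candidate anchors for one attribute.

    history holds (document, known_value) pairs. Each candidate path is
    scored by how often it reproduces the known value across the history.
    """
    candidates = set()
    for doc, value in history:
        candidates.update(find_paths(doc, value))

    scored = []
    for path in candidates:
        hits = sum(1 for doc, value in history if extract(doc, path) == value)
        scored.append((hits / len(history), path))

    # Highest accuracy first; repr() gives a stable order among ties.
    scored.sort(key=lambda item: (-item[0], repr(item[1])))
    return scored
```

Under these assumptions, the first entry returned by `rank_anchors` is the path to try first when extracting the attribute from a new document that follows the same template, falling back to later entries when `extract` returns `None`.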