EXTRACTING NAMED ENTITIES BASED USING DOCUMENT STRUCTURE
Machine translation: Extracting named entities using document structure
Abstract
The invention relates to a method for extracting at least one entity from at least one document (100), comprising the steps of: receiving the at least one document (100) as input data (S1); identifying at least one block (10) in the at least one document (100) based on the structure or layout of the at least one document (100) (S2); determining at least one feature (12) associated with the identified at least one block (10), wherein the at least one feature (12) relates to the content of the at least one block (10), the structure of the at least one block (10) and/or other information related to the at least one block (10) (S3); and determining at least one score for the at least one block (10) based on the at least one block (10) and the associated at least one feature (12) using machine learning, wherein the at least one score is the likelihood that the at least one block (10) contains the at least one entity. Further, the invention relates to a corresponding computer program product and system.
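Steps S1 to S3 and the scoring step can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not name a block segmentation method, a feature set, or a model, so everything here is an assumption made for illustration. Blocks are taken to be paragraphs separated by blank lines, the four features are hypothetical, and a small logistic regression trained on a made-up five-block document stands in for the "machine learning" step. All function names (`identify_blocks`, `block_features`, `train_logistic`, `score_blocks`) are invented for this example.

```python
import math
import re


def identify_blocks(document: str) -> list[str]:
    """S2 (assumed): treat blank-line-separated paragraphs as layout blocks."""
    return [b.strip() for b in re.split(r"\n\s*\n", document) if b.strip()]


def block_features(block: str, index: int, total: int) -> list[float]:
    """S3 (assumed): a few content and structure features for one block."""
    tokens = block.split()
    capitalized = sum(1 for t in tokens if t[0].isupper())
    return [
        capitalized / len(tokens),     # content: share of capitalized tokens
        1.0 if ":" in block else 0.0,  # content: looks like a "key: value" field
        index / max(total - 1, 1),     # structure: relative position in document
        min(len(tokens), 50) / 50.0,   # structure: block length, capped at 50 tokens
    ]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, epochs=500, lr=0.5):
    """Fit logistic regression with plain stochastic gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, target in zip(X, y):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            grad = p - target  # derivative of log loss w.r.t. the logit
            w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
            b -= lr * grad
    return w, b


def score_blocks(document: str, w, b):
    """Return (block, score) pairs; the score estimates P(block contains an entity)."""
    blocks = identify_blocks(document)
    scored = []
    for i, block in enumerate(blocks):
        x = block_features(block, i, len(blocks))
        scored.append((block, sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)))
    return scored


# Toy training document with made-up labels: 1 = block contains a named entity.
TRAIN_DOC = """Invoice

Vendor: Acme Corp

Thank you for your business, payment is due within thirty days of receipt.

Customer: Jane Doe

please keep this page for your records and contact us with any questions."""
TRAIN_LABELS = [0, 1, 0, 1, 0]

train_blocks = identify_blocks(TRAIN_DOC)
X_train = [block_features(b, i, len(train_blocks)) for i, b in enumerate(train_blocks)]
weights, bias = train_logistic(X_train, TRAIN_LABELS)

# S1: receive a new document as input, then score each of its blocks.
NEW_DOC = """Report

Author: John Smith

this section summarizes the quarterly results in plain language for all interested readers."""
results = score_blocks(NEW_DOC, weights, bias)
for block, score in results:
    print(f"{score:.2f}  {block[:40]}")
```

Scoring whole blocks rather than individual tokens reflects the idea in the abstract that layout narrows down where an entity is likely to appear; a separate extraction step, not shown here, would then pull the entity out of the highest-scoring blocks. In practice the segmentation would come from the document's real structure (for example PDF or HTML layout) and the model would be trained on far more labeled data.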