SYSTEMS AND METHODS FOR INITIAL LEARNING OF AN ADAPTIVE DETERMINISTIC CLASSIFIER FOR DATA EXTRACTION

Abstract

This disclosure relates to initial learning of a classifier for automating extraction of structured data from unstructured or semi-structured data. In one embodiment, a method is disclosed, comprising: identifying at least one expected relation class associated with at least one expected relation data; populating at least one expected named entity (NE) data from the at least one identified expected relation class; generating training data by tagging the at least one expected relation data and the at least one identified expected relation class with unstructured or semi-structured data; generating feedback data for a relation data and relation class, using a convergence technique on the tagged training data; retuning an NE classifier cluster and a relation classifier cluster by continuously tagging new training data or generating new cascaded expressions for a deterministic classifier and a statistical classifier; and extracting the structured data when the NE classifier cluster and the relation classifier cluster converge.
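
The feedback-and-retune loop described in the abstract can be illustrated with a minimal sketch. It is not the patented method: it models only a deterministic classifier as an ordered cascade of regular expressions, leaves out the statistical classifier and the separate NE and relation classifier clusters, and treats "no mismatches on the tagged training data" as the convergence test. All names (`DeterministicClassifier`, `initial_learning`, `candidate_rules`, the relation classes and the sample texts) are hypothetical and chosen for illustration.

```python
import re
from dataclasses import dataclass, field


@dataclass
class DeterministicClassifier:
    """Hypothetical rule-based classifier: an ordered cascade of regex rules."""
    patterns: list = field(default_factory=list)  # (relation_class, compiled regex)

    def add_rule(self, relation_class, pattern):
        # Append a new expression to the end of the cascade.
        self.patterns.append((relation_class, re.compile(pattern, re.IGNORECASE)))

    def extract(self, text):
        # Return (relation_class, value) for the first rule that matches, else None.
        for relation_class, regex in self.patterns:
            match = regex.search(text)
            if match:
                return relation_class, match.group(1)
        return None


def mismatches(clf, tagged):
    # Feedback data: tagged examples the classifier does not yet reproduce.
    return [ex for ex in tagged if clf.extract(ex[0]) != (ex[1], ex[2])]


def initial_learning(tagged, candidate_rules, max_rounds=10):
    """Add candidate rules one at a time until the tagged data is reproduced.

    tagged: list of (text, expected_relation_class, expected_value).
    candidate_rules: list of (relation_class, regex with one capture group);
    an assumed stand-in for "generating new cascaded expressions".
    Returns (classifier, converged flag).
    """
    clf = DeterministicClassifier()
    pending = list(candidate_rules)
    for _ in range(max_rounds):
        if not mismatches(clf, tagged) or not pending:
            break
        clf.add_rule(*pending.pop(0))
    return clf, not mismatches(clf, tagged)


# Made-up training data and rules, for illustration only.
tagged = [
    ("Invoice date: 2018-03-16", "invoice_date", "2018-03-16"),
    ("Total amount: 450.00 USD", "total_amount", "450.00"),
]
candidate_rules = [
    ("invoice_date", r"invoice date:\s*(\d{4}-\d{2}-\d{2})"),
    ("total_amount", r"total amount:\s*(\d+\.\d{2})"),
]

clf, converged = initial_learning(tagged, candidate_rules)
if converged:
    # Extraction is only attempted once the classifier has converged.
    print(clf.extract("Invoice date: 2019-08-01"))
```

In this sketch, extraction is gated on the `converged` flag, mirroring the abstract's final step of extracting structured data only once the classifiers converge.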

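Bibliographic details

  • Publication number: US2019236492A1
  • Publication date: 2019-08-01
  • Original format: PDF
  • Applicant/assignee: WIPRO LIMITED
  • Application number: US201815922983
  • Inventor: SAMRAT SAHA
  • Filing date: 2018-03-16
  • Classification: G06N99; G06F17/30
  • Country: US
  • Date indexed: 2022-08-21 12:07:41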
