A first vector representation of a first word within a first narrative text and a machine-generated label corresponding to the first word are constructed. Using the first vector representation, an annotator model is trained. The annotator model is configured to produce a set of probabilities, each probability representing a probable output annotation corresponding to a word within a narrative text. The training includes minimizing a difference between a first human-generated label corresponding to the first word and a first probable output annotation corresponding to the first word. Using the trained annotator model, which is configured to produce an output annotation corresponding to a word within a narrative text, and a second narrative text, second training data is generated. The second training data is usable to train a relation extraction model.
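The pipeline described above can be sketched in miniature. This is an illustrative assumption, not the actual implementation: the annotator model is stood in for by a logistic-regression classifier over word vectors, the "difference" being minimized is cross-entropy against the human-generated labels, and all data, dimensions, and names (`first_vectors`, `human_labels`, `second_training_data`) are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Step 1 (assumed setup): vector representations of words in a first
# narrative text, with human-generated labels as the training target.
first_vectors = rng.normal(size=(100, 8))             # one 8-dim vector per word
human_labels = (first_vectors[:, 0] > 0).astype(int)  # toy "gold" annotations

# Step 2: train the annotator model by minimizing the difference
# (here, cross-entropy via gradient descent) between the human-generated
# labels and the model's probable output annotations.
w, b, lr = np.zeros(8), 0.0, 0.5
for _ in range(200):
    probs = 1.0 / (1.0 + np.exp(-(first_vectors @ w + b)))  # set of probabilities
    grad = probs - human_labels                             # cross-entropy gradient
    w -= lr * first_vectors.T @ grad / len(grad)
    b -= lr * grad.mean()

# Step 3: apply the trained annotator to a second narrative text's word
# vectors to generate second training data, i.e. (vector, annotation)
# pairs usable to train a downstream relation extraction model.
second_vectors = rng.normal(size=(50, 8))
second_probs = 1.0 / (1.0 + np.exp(-(second_vectors @ w + b)))
second_training_data = list(zip(second_vectors, (second_probs > 0.5).astype(int)))
```

In this sketch the second training data inherits the annotator's decisions rather than requiring further human labeling, which is the point of generating it with the trained model.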