A first vector representation of a first word within a first narrative text and a machine-generated label corresponding to the first word are constructed. Using the first vector representation, an annotator model is trained. The annotator model is configured to produce a set of probabilities, each probability representing a probable output annotation corresponding to a word within a narrative text. The training includes minimizing a difference between a first human-generated label corresponding to the first word and a first probable output annotation corresponding to the first word. Using the trained annotator model, which is configured to produce an output annotation corresponding to a word within a narrative text, and a second narrative text, second training data is generated. The second training data is usable to train a relation extraction model.
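The pipeline described above can be sketched in miniature. This is an illustrative assumption, not the actual implementation: the annotator model is stood in for by a logistic-regression classifier over word vectors, the "difference" being minimized is cross-entropy against the human-generated labels, and all data, dimensions, and names (`first_vectors`, `human_labels`, `second_training_data`) are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Step 1 (assumed setup): vector representations of words in a first
# narrative text, with human-generated labels as the training target.
first_vectors = rng.normal(size=(100, 8))             # one 8-dim vector per word
human_labels = (first_vectors[:, 0] > 0).astype(int)  # toy "gold" annotations

# Step 2: train the annotator model by minimizing the difference
# (here, cross-entropy via gradient descent) between the human-generated
# labels and the model's probable output annotations.
w, b, lr = np.zeros(8), 0.0, 0.5
for _ in range(200):
    probs = 1.0 / (1.0 + np.exp(-(first_vectors @ w + b)))  # set of probabilities
    grad = probs - human_labels                             # cross-entropy gradient
    w -= lr * first_vectors.T @ grad / len(grad)
    b -= lr * grad.mean()

# Step 3: apply the trained annotator to a second narrative text's word
# vectors to generate second training data, i.e. (vector, annotation)
# pairs usable to train a downstream relation extraction model.
second_vectors = rng.normal(size=(50, 8))
second_probs = 1.0 / (1.0 + np.exp(-(second_vectors @ w + b)))
second_training_data = list(zip(second_vectors, (second_probs > 0.5).astype(int)))
```

In this sketch the second training data inherits the annotator's decisions rather than requiring further human labeling, which is the point of generating it with the trained model.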