
Method, apparatus and electronic device for determining knowledge sample data set


Abstract

Provided are a method, an apparatus, and an electronic device for determining a knowledge sample data set. The method includes: acquiring a preset number of SPO (subject-predicate-object) triplet formats and source texts; acquiring, according to the SPO triplet formats, n SPO entries corresponding to those formats; searching the source texts for m first texts that match the n SPO entries, and generating a first knowledge sample data set; determining, from the m first texts, k second texts that meet the SPO triplet formats, and generating a second knowledge sample data set; and generating a target knowledge sample data set according to the first knowledge sample data set and the second knowledge sample data set. In the embodiments, the knowledge sample data set is generated automatically: batch generation is fast, the cost is low, and the amount of data that can be produced is large, thus meeting the training requirement.
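
The steps in the abstract can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: the `SPOFormat` and `SPOEntry` classes, the substring match used to find the first texts, the cue-phrase check used to decide whether a text "meets" a format, and the rule that combines the two sets into the target set are all assumptions chosen only to make the data flow concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SPOFormat:
    """A triplet schema, e.g. (Person, birthplace, Place). Hypothetical type."""
    subject_type: str
    predicate: str
    object_type: str


@dataclass(frozen=True)
class SPOEntry:
    """A concrete triple that instantiates one format. Hypothetical type."""
    fmt: SPOFormat
    subject: str
    obj: str


def find_first_texts(entries, source_texts):
    """Assumed matching rule: a text matches an entry when it mentions
    both the entry's subject and its object as substrings."""
    return [
        (text, entry)
        for text in source_texts
        for entry in entries
        if entry.subject in text and entry.obj in text
    ]


def find_second_texts(first_samples, cues):
    """Assumed filter: keep a first text only if it also contains a cue
    phrase registered for the predicate of its entry's format."""
    return [
        (text, entry)
        for text, entry in first_samples
        if any(cue in text for cue in cues.get(entry.fmt.predicate, ()))
    ]


def build_target(first_samples, second_samples):
    """Assumed combination rule: every first text becomes a sample, flagged
    by whether it also appears in the second set."""
    confirmed = set(second_samples)
    return [
        {
            "text": text,
            "subject": entry.subject,
            "predicate": entry.fmt.predicate,
            "object": entry.obj,
            "meets_format": (text, entry) in confirmed,
        }
        for text, entry in first_samples
    ]


# Illustrative inputs only; not data from the patent.
birthplace = SPOFormat("Person", "birthplace", "Place")
entries = [SPOEntry(birthplace, "Marie Curie", "Warsaw")]
source_texts = [
    "Marie Curie was born in Warsaw.",
    "Marie Curie visited Warsaw in 1913.",
    "Paris is the capital of France.",
]
cues = {"birthplace": ("born in",)}

first = find_first_texts(entries, source_texts)
second = find_second_texts(first, cues)
target = build_target(first, second)
```

In this sketch the first set is deliberately permissive (any co-occurrence of subject and object), and the second set narrows it to texts that plausibly express the relation. The abstract does not say how the two sets are merged, so the flag-based merge above is only one possible reading.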
