GENERATING DATA FROM IMBALANCED TRAINING DATA SETS

Abstract

Injecting generated data samples into a minority data class of an imbalanced training data set is provided. In response to receiving an input to balance an imbalanced training data set that includes a majority data class and a minority data class, a set of data samples is generated for the minority data class. For each generated data sample, a distance is calculated from that sample to the center of a kernel that includes a set of data samples of the majority data class. Each generated data sample is stored in the distance score bucket that corresponds to its calculated distance. Generated data samples are then selected from a number of the highest-ranking distance score buckets, and the selected samples are injected into the minority data class.
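The steps in the abstract can be sketched in a few lines of Python. This is only an illustrative reading, not the patented implementation, and it relies on several assumptions the abstract does not state: candidates are generated by adding Gaussian noise to existing minority samples, the kernel center is taken as the mean of the majority samples, buckets are equal-width distance bins, and "highest ranking" is interpreted as farthest from the majority center. All function and parameter names (`balance`, `bucket_by_distance`, `top_k`, `noise`, and so on) are hypothetical.

```python
import math
import random
from collections import defaultdict


def centroid(points):
    """Mean of a list of equal-length numeric tuples."""
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dims))


def generate_candidates(minority, n, noise, rng):
    """ASSUMPTION: candidates are noisy copies of real minority samples."""
    candidates = []
    for _ in range(n):
        base = rng.choice(minority)
        candidates.append(tuple(x + rng.gauss(0.0, noise) for x in base))
    return candidates


def bucket_by_distance(candidates, center, n_buckets):
    """Store each candidate in an equal-width distance score bucket.

    Bucket 0 holds the candidates closest to the center and bucket
    n_buckets - 1 holds the farthest ones.
    """
    dists = [math.dist(c, center) for c in candidates]
    lo, hi = min(dists), max(dists)
    width = (hi - lo) / n_buckets or 1.0  # avoid zero width if all distances match
    buckets = defaultdict(list)
    for cand, d in zip(candidates, dists):
        idx = min(int((d - lo) / width), n_buckets - 1)
        buckets[idx].append(cand)
    return buckets


def balance(majority, minority, n_candidates=200, n_buckets=10,
            top_k=3, noise=0.1, seed=0):
    """Return the minority class with selected generated samples appended."""
    rng = random.Random(seed)
    center = centroid(majority)  # ASSUMPTION: kernel center = majority mean
    candidates = generate_candidates(minority, n_candidates, noise, rng)
    buckets = bucket_by_distance(candidates, center, n_buckets)
    # ASSUMPTION: the highest-ranking buckets are those farthest from the center.
    ranked = sorted(buckets, reverse=True)[:top_k]
    selected = [c for idx in ranked for c in buckets[idx]]
    # Inject at most enough samples to match the majority class size.
    needed = max(len(majority) - len(minority), 0)
    return minority + selected[:needed]
```

Calling `balance(majority, minority)` with two lists of equal-length numeric tuples returns the original minority samples followed by the injected ones. Preferring candidates far from the majority center is one plausible way to keep synthetic samples out of the region where the classes overlap; if the ranking is meant to run the other way, only the sort direction changes.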
