GENERATING DATA FROM IMBALANCED TRAINING DATA SETS
Abstract
Injecting generated data samples into a minority data class of an imbalanced training data set is provided. In response to receiving an input to balance an imbalanced training data set that includes a majority data class and a minority data class, a set of data samples is generated for the minority data class. A distance is calculated from each generated data sample to the center of a kernel that includes a set of data samples of the majority data class. Each generated data sample is stored in the distance score bucket that corresponds to its calculated distance. Generated data samples are then selected from a number of the highest-ranking distance score buckets and injected into the minority data class.
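The steps in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the generator, the kernel, the bucket layout, or the ranking rule, so everything below is an assumption made for illustration: candidates are produced by adding Gaussian noise to existing minority samples, the kernel center is taken to be the mean of the majority samples, buckets are equal-width bins of Euclidean distance, and "highest ranking" is read as "farthest from the majority center". The function name `balance_minority` and all of its parameters are hypothetical.

```python
import numpy as np

def balance_minority(majority, minority, n_candidates=200, n_buckets=10,
                     top_buckets=3, noise_scale=0.1, seed=0):
    """Illustrative sketch only; not the patented method's actual implementation."""
    rng = np.random.default_rng(seed)

    # Assumption: generate candidates by jittering randomly chosen minority samples.
    base = minority[rng.integers(0, len(minority), size=n_candidates)]
    candidates = base + rng.normal(0.0, noise_scale, size=base.shape)

    # Assumption: the kernel center is the mean of the majority-class samples.
    center = majority.mean(axis=0)
    distances = np.linalg.norm(candidates - center, axis=1)

    # Assumption: distance score buckets are equal-width bins over the observed range.
    edges = np.linspace(distances.min(), distances.max(), n_buckets + 1)
    bucket_ids = np.digitize(distances, edges[1:-1])  # values in 0 .. n_buckets - 1

    # Assumption: the highest-ranking buckets are those farthest from the center.
    selected = candidates[bucket_ids >= n_buckets - top_buckets]

    # Inject the selected generated samples into the minority class.
    return np.vstack([minority, selected]), selected
```

Calling `balance_minority(majority, minority)` with two 2-D NumPy arrays of shape `(n_samples, n_features)` returns the augmented minority class and the injected samples. Under this reading, preferring the farthest buckets keeps generated minority samples away from the region the majority class occupies; a different ranking rule would only change the selection line.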