METHOD FOR GENERATING SYNTHETIC DATA SETS AT SCALE WITH NON-REDUNDANT PARTITIONING

Abstract

An example system includes a first machine and a second machine, a clustering module, and a training module. The clustering module receives a plurality of data sets, each including attributes. The clustering module partitions the plurality of data sets into a first clustered data set and a second clustered data set. Each data set of the plurality of data sets is partitioned. The training module assigns a first stochastic model to the first clustered data set and a second stochastic model to the second clustered data set. The first machine selects the first clustered data set and the first stochastic model and generates a first synthetic data set having generated data for each one of the attributes. The second machine selects the second clustered data set and the second stochastic model and generates a second synthetic data set having generated data for each one of the attributes.
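
The flow described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the patented implementation: each data set is treated as a record of numeric attributes, the attribute names `age` and `income` are hypothetical, a median split stands in for the clustering module, a per-attribute Gaussian stands in for the stochastic model, and two worker threads stand in for the first and second machines.

```python
import random
import statistics
from concurrent.futures import ThreadPoolExecutor

ATTRIBUTES = ("age", "income")  # hypothetical attribute names


def partition(data_sets, key="age"):
    """Stand-in clustering module: split on the median of one attribute.

    Every data set lands in exactly one cluster, so the partition is
    non-redundant. Assumes the key values are not all identical, otherwise
    the second cluster would be empty.
    """
    pivot = statistics.median([d[key] for d in data_sets])
    first = [d for d in data_sets if d[key] <= pivot]
    second = [d for d in data_sets if d[key] > pivot]
    return first, second


def train(cluster):
    """Stand-in training module: one (mean, std dev) Gaussian per attribute."""
    return {
        a: (
            statistics.mean([d[a] for d in cluster]),
            statistics.pstdev([d[a] for d in cluster]),
        )
        for a in ATTRIBUTES
    }


def generate(model, n, seed):
    """Stand-in machine: sample n synthetic records covering every attribute."""
    rng = random.Random(seed)
    return [
        {a: rng.gauss(mu, sigma) for a, (mu, sigma) in model.items()}
        for _ in range(n)
    ]


def run(data_sets, n_per_machine=5):
    """Partition, train one model per cluster, then generate in parallel."""
    first, second = partition(data_sets)
    jobs = [(train(first), n_per_machine, 1), (train(second), n_per_machine, 2)]
    with ThreadPoolExecutor(max_workers=2) as pool:
        synthetic = list(pool.map(lambda job: generate(*job), jobs))
    return first, second, synthetic


# Small demonstration with made-up input records.
sample = [{"age": 20 + i, "income": 1000.0 * i} for i in range(6)]
first, second, synthetic = run(sample)
print(len(first), len(second), [len(s) for s in synthetic])
```

Because each worker receives only its own cluster's model, the two generation jobs share no input data, which is the property the title calls non-redundant partitioning.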

Bibliographic details

Publication number: US2018107729A1
Publication date: 2018-04-19
Original format: PDF
Applicant/assignee: RED HAT INC.
Application number: US201615294142
Inventors: JAY VYAS; RONALD NOWLING; HUAMIN CHEN
Filing date: 2016-10-14
Classification: G06F17/30; G06N99
Country: US
Database entry time: 2022-08-21 13:04:57
