Conference: TPC Technology Conference on Performance Evaluation and Benchmarking

Composite Key Generation on a Shared-Nothing Architecture

Abstract

Generating synthetic data sets is integral to benchmarking, debugging, and simulating future scenarios. As data sets grow larger, realistic data characteristics become necessary for the success of new algorithms. Recently introduced software systems allow for truly parallel synthetic data generation. These systems use fast pseudorandom number generators and can handle complex schemas and uniqueness constraints on single attributes. Uniqueness is essential for forming keys, which identify single entries in a database instance. The uniqueness property is usually guaranteed by sampling from a uniform distribution and adjusting the sample size to the output size of the table such that there are no collisions. However, for real composite keys, where only the combination of the key attributes has the uniqueness property, a different strategy is needed. In this paper, we present a novel approach to generating composite keys within a parallel data generation framework. We compute a joint probability distribution that incorporates the distributions of the key attributes and use the unique sequence positions of entries to address distinct values in the key domain.
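The idea of using an entry's unique sequence position to address a distinct value in the key domain can be illustrated with a minimal Python sketch. This is not the paper's algorithm: it assumes every key attribute is uniformly distributed and does not reproduce the joint probability distribution the abstract mentions. The function names `composite_key` and `generate_partition`, and the `multiplier` parameter, are hypothetical and chosen only for illustration.

```python
from math import gcd, prod


def composite_key(position, domain_sizes, multiplier=1):
    """Map a unique row position to a unique tuple of attribute indices.

    Illustrative assumption: each attribute i takes values 0..domain_sizes[i]-1
    with uniform probability. The paper's skewed joint distribution is not modeled.
    """
    total = prod(domain_sizes)  # size of the composite key domain
    if not 0 <= position < total:
        raise ValueError("position outside the key domain")
    if gcd(multiplier, total) != 1:
        raise ValueError("multiplier must be coprime to the key domain size")

    # Multiplying by a value coprime to `total` is a bijection on [0, total),
    # so distinct positions still land on distinct slots.
    slot = (position * multiplier) % total

    # Mixed-radix decoding: peel off one digit per key attribute,
    # starting with the last attribute.
    digits = []
    for size in reversed(domain_sizes):
        slot, digit = divmod(slot, size)
        digits.append(digit)
    return tuple(reversed(digits))


def generate_partition(start, stop, domain_sizes, multiplier=1):
    """Generate the keys for rows start..stop-1 with no shared state.

    Each worker in a shared-nothing setup could call this on its own
    row range without coordinating with the others.
    """
    return [composite_key(p, domain_sizes, multiplier) for p in range(start, stop)]
```

Because each key depends only on the row position, two workers handling disjoint row ranges cannot produce the same composite key, which is the property that makes coordination-free parallel generation possible in this simplified setting.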
