On Random Sampling over Joins

Abstract

A major bottleneck in implementing sampling as a primitive relational operation is the inefficiency of sampling the output of a query. It is not even known whether it is possible to generate a sample of a join tree without first evaluating the join tree completely. We undertake a detailed study of this problem and attempt to analyze it in a variety of settings. We present theoretical results explaining the difficulty of this problem and setting limits on the efficiency that can be achieved. Based on new insights into the interaction between join and sampling, we develop join sampling techniques for the settings where our negative results do not apply. Our new sampling algorithms are significantly more efficient than those known earlier. We present an experimental evaluation of our techniques on Microsoft's SQL Server 7.0.
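
The question the abstract raises, whether a join can be sampled without first computing it, can be illustrated with a minimal Python sketch. It is not taken from the paper and is not claimed to be one of its algorithms. It assumes a two-relation equi-join in which the second relation is small enough to index in memory, and the names `sample_join`, `key1`, and `key2` are hypothetical. Each tuple of the first relation is picked with probability proportional to its number of join partners, then one partner is chosen uniformly, so every tuple of the join output is equally likely.

```python
import random
from collections import defaultdict


def sample_join(r1, r2, key1, key2, n, rng=None):
    """Draw n tuples uniformly, with replacement, from the equi-join of
    r1 and r2 without materializing the join output.

    Assumption (illustrative only): r1 is a list and r2 fits in an
    in-memory index keyed by the join attribute.
    """
    rng = rng or random.Random()

    # Index r2 by join key; the list lengths act as frequency statistics.
    index = defaultdict(list)
    for t2 in r2:
        index[key2(t2)].append(t2)

    # Weight each r1 tuple by the number of r2 tuples it joins with.
    weights = [len(index.get(key1(t1), ())) for t1 in r1]
    if sum(weights) == 0:
        return []  # the join is empty, so there is nothing to sample

    # Weighted pick from r1, then a uniform pick among its r2 partners.
    picks = rng.choices(r1, weights=weights, k=n)
    return [(t1, rng.choice(index[key1(t1)])) for t1 in picks]


if __name__ == "__main__":
    r1 = [(1, "a"), (2, "b"), (3, "c")]
    r2 = [(1, "x"), (1, "y"), (2, "z")]
    first = lambda t: t[0]
    print(sample_join(r1, r2, first, first, n=5, rng=random.Random(0)))
```

The weighting step is what keeps the sample uniform: sampling each relation independently and joining the two samples would under-represent join values that are frequent in the second relation.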