Conference: International Workshop on Web Intelligence

Selecting Adequate Samples for Approximate Decision Support Queries

Abstract

For highly selective queries, a simple random sample of records drawn from a large data warehouse may not contain a sufficient number of records that satisfy the query conditions. Efficient sampling schemes for such queries require innovative techniques that can access the records relevant to each specific query. When drawing the sample, it is advantageous to know what sample size would be adequate for a given query. This paper proposes methods for picking adequate samples that ensure approximate query results with a desired level of accuracy. A special index based on a structure known as the k-MDI Tree is used to draw samples. An unbiased estimator, inverse simple random sampling without replacement, is adapted to estimate adequate sample sizes for queries. The methods are evaluated experimentally on a large real-life data set. The evaluation results show that adequate sample sizes can be determined such that the errors in the outputs of most queries are within the acceptable limit of 5%.
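
The abstract does not spell out the estimator, so the following is only a minimal sketch of generic inverse simple random sampling without replacement, not the paper's method. It assumes records are drawn in a uniformly random order until `k` records satisfy the query predicate, and it uses the textbook unbiased selectivity estimate `(k - 1) / (n - 1)`, where `n` is the number of draws needed. The function name `inverse_srswor`, its parameters, and the use of a plain shuffle in place of the k-MDI Tree index are all assumptions made for illustration.

```python
import random


def inverse_srswor(records, predicate, k, seed=None):
    """Draw records without replacement until k of them satisfy predicate.

    Returns (n, estimate): n is the number of draws that were needed and
    estimate is (k - 1) / (n - 1), the standard unbiased estimate of the
    fraction of matching records under inverse sampling.
    Hypothetical helper: it shuffles the whole list, whereas the paper
    draws its samples through a k-MDI Tree index.
    """
    if k < 2:
        raise ValueError("k must be at least 2 for the estimator to be defined")
    rng = random.Random(seed)
    order = list(records)
    rng.shuffle(order)  # a uniformly random order = sampling without replacement
    hits = 0
    for n, record in enumerate(order, start=1):
        if predicate(record):
            hits += 1
            if hits == k:
                return n, (k - 1) / (n - 1)
    raise ValueError("fewer than k records satisfy the predicate")


if __name__ == "__main__":
    # Toy population: 50 of 1000 records match, so the true selectivity is 0.05.
    population = [1] * 50 + [0] * 950
    draws, estimate = inverse_srswor(population, lambda r: r == 1, k=10, seed=0)
    print(draws, estimate)
```

In this sketch the number of draws `n` plays the role of a query-specific sample size: a more selective query needs more draws before `k` matching records are seen. How the paper turns that quantity into an adequate sample size for the 5% error limit is not stated in the abstract.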
