首页>
外国专利>
Histogram construction using adaptive random sampling with cross-validation for database systems
Histogram construction using adaptive random sampling with cross-validation for database systems
展开▼
机译:使用交叉验证的自适应随机抽样构建直方图
展开▼
页面导航
摘要
著录项
相似文献
摘要
Using adaptive random sampling with cross-validation helps determine when enough data of a database has been sampled to construct histograms on one or more columns of one or more tables of the database within a desired or predetermined degree of accuracy. An adaptive random sampling histogram construction tool constructs an approximate equi-height k-histogram using an initial sample of data values from the database and iteratively updates the histogram using an additional sample of data values from the database until the histogram is within the desired degree of accuracy. The accuracy of the histogram is cross-validated against the additional sample at each iteration, and the additional sample is used to update the histogram to help improve its accuracy. The accuracy of the histogram may be measured by an error in distribution of the additional sample over the histogram as compared to a threshold error using a suitable error metric. By attempting to sample only the number of data values necessary to construct the histogram within the desired degree of accuracy, the adaptive random sampling histogram construction tool attempts to avoid any cost increases in time and memory from sampling too many data values.
展开▼