
Recovering Information from Summary Data


Abstract

Data is often stored in summarized form, as a histogram of aggregates (COUNTs, SUMs, or AVeraGes) over specified ranges. We study how to estimate the original detail data from the stored summary. We formulate this task as an inverse problem, specifying a well-defined cost function that has to be optimized under constraints. We show that our formulation includes the uniformity and independence assumptions as a special case, and that it can achieve better reconstruction results if we maximize the smoothness as opposed to the uniformity. In our experiments on real and synthetic datasets, the proposed method almost consistently outperforms its competitor, improving the root-mean-square error by up to 20 per cent for stock price data, and up to 90 per cent for smoother data sets. Finally, we show how to apply this theory to a variety of database problems that involve partial information, such as OLAP, data warehousing and histograms in query optimization.
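The constrained-optimization view described in the abstract can be illustrated with a minimal sketch. It assumes one-dimensional detail data, disjoint buckets that store SUMs, and two cost functions chosen here for illustration: the squared Euclidean norm (whose minimizer spreads each SUM evenly over its bucket, i.e. the uniformity assumption) and the sum of squared first differences as a smoothness measure. The function names, the bucket layout, and the sample data are hypothetical and are not taken from the paper, whose exact cost function may differ.

```python
import numpy as np

def bucket_matrix(edges, n):
    """Constraint matrix A: row k has ones over cells edges[k]..edges[k+1]-1."""
    A = np.zeros((len(edges) - 1, n))
    for k in range(len(edges) - 1):
        A[k, edges[k]:edges[k + 1]] = 1.0
    return A

def reconstruct_uniform(A, sums):
    """Minimize sum(x**2) subject to A @ x == sums (minimum-norm solution).

    For disjoint buckets this spreads each SUM evenly over its bucket.
    """
    return np.linalg.pinv(A) @ sums

def reconstruct_smooth(A, sums):
    """Minimize sum((x[i+1] - x[i])**2) subject to A @ x == sums.

    Solved through the KKT linear system of the equality-constrained
    quadratic program (illustrative choice of smoothness cost).
    """
    m, n = A.shape
    D = np.diff(np.eye(n), axis=0)  # (n-1) x n first-difference operator
    K = np.block([[2.0 * D.T @ D, A.T],
                  [A, np.zeros((m, m))]])
    rhs = np.concatenate([np.zeros(n), sums])
    return np.linalg.solve(K, rhs)[:n]

def rmse(estimate, truth):
    """Root-mean-square error between a reconstruction and the detail data."""
    return float(np.sqrt(np.mean((estimate - truth) ** 2)))

if __name__ == "__main__":
    detail = np.linspace(0.0, 11.0, 12)   # hypothetical smooth detail data
    edges = [0, 4, 8, 12]                 # three buckets of four cells each
    A = bucket_matrix(edges, detail.size)
    sums = A @ detail                     # the stored summary (bucket SUMs)

    print("uniform RMSE:", rmse(reconstruct_uniform(A, sums), detail))
    print("smooth  RMSE:", rmse(reconstruct_smooth(A, sums), detail))
```

Both reconstructions reproduce the stored SUMs exactly; they differ only in which cost function picks one candidate out of the many detail vectors consistent with the summary.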
