首页> 外文会议>ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems >Deterministic wavelet thresholding for maximum-error metrics
【24h】

Deterministic wavelet thresholding for maximum-error metrics

机译:最大误差度量的确定性小波阈值

获取原文

摘要

Several studies have demonstrated the effectiveness of the wavelet, decomposition as a tool for reducing large amounts of data down to compact, wavelet synopses that can be used to obtain fast, accurate approximate answers to user queries. While conventional wavelet synopses are based on greedily minimizing the overall root-mean-squared (i.e., L2-norm) error in the data approximation, recent work has demonstrated that such synopses can suffer from important problems, including severe bias and wide variance in the quality of the data reconstruction, and lack of non-trivial guarantees for individual approximate answers. As a result, probabilistic thresholding schemes have been recently proposed as a means of building wavelet synopses that try to probabilistically control other approximation-error metrics, such as the maximum relative error in data-value reconstruction, which is arguably the most important for approximate query answers and meaningful error guarantees.One of the main open problems posed by this earlier work is whether it is possible to design efficient deterministic wavelet-thresholding algorithms for minimizing non-L2 error metrics that are relevant to approximate query processing systems, such as maximum relative or maximum absolute error. Obviously, such algorithms can guarantee better wavelet synopses and avoid the pitfalls of probabilistic techniques (e.g., "bad" coin-flip sequences) leading to poor solutions. In this paper, we address this problem and propose novel, computationally efficient schemes for deterministic wavelet thresholding with the objective of optimizing maximum-error metrics. We introduce an optimal low polynomial-time algorithm for one-dimensional wavelet thresholding--our algorithm is based on a new Dynamic-Programming (DP) formulation, and can be employed to minimize the maximum relative or absolute error in the data reconstruction. Unfortunately, directly extending our one-dimensionalDP algorithm to multi-dimensional wavelets results in a super-exponential increase in time complexity with the data dimensionality. Thus, we also introduce novel, polynomial-time approximation schemes (with tunable approximation guarantees for the target maximum-error metric) for deterministic wavelet thresholding in multiple dimensions.
机译:多项研究表明,小波分解作为将大量数据缩减为紧凑的小波提要的一种工具,可以有效地获得用户查询的快速,准确的近似答案。尽管传统的小波提要基于贪婪地最小化数据近似中整体均方根(即 L 2 -norm)误差,但最近的工作表明提要可能遭受重要问题,包括严重的偏见和数据重构质量的巨大差异,以及缺乏对单个近似答案的简单保证。结果,最近提出了概率阈值方案作为构建小波提要的一种方法,以试图概率控制其他逼近误差度量,例如最大近似误差。数据值重构,这可能是近似查询答案和有意义的错误保证中最重要的。此早期工作提出的主要开放问题之一是,是否有可能设计有效的确定性小波阈值用于最小化与近似查询处理系统相关的非 L 2 错误度量的算法,例如最大相对或最大绝对错误。显然,这样的算法可以保证更好的小波提要,并且避免概率技术(例如“不良”硬币翻转序列)的陷阱,从而导致解决方案不佳。在本文中,我们解决了这个问题,并提出了新颖的,计算效率高的确定性小波阈值处理方案,目的是优化最大误差度量。我们针对一维小波阈值引入了最优低阶多项式时间算法-我们的算法基于新的动态编程(DP)公式,并且可以采用最小化数据重建中的最大相对误差或绝对误差。不幸的是,直接将我们的一维DP算法扩展到多维小波会导致时间复杂度随数据维数呈指数增长。因此,我们还介绍了用于多维确定性小波阈值处理的新颖的多项式时间逼近方案(具有目标最大误差度量的可调逼近保证)。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号