Properties and Applications of Diamond Cubes.

Abstract

Queries that constrain multiple dimensions simultaneously are difficult to express in both Structured Query Language (SQL) and multidimensional languages. Moreover, they can be inefficient. We introduce the diamond cube operator to facilitate the expression and computation of one such important class of multidimensional query, and we define the resulting cube to be a diamond.

The diamond cube operator identifies an important cube such that all remaining attributes are strongly related. For example, suppose a company wants to close shops and terminate product lines, whilst meeting some profitability constraints simultaneously over both products and shops. The diamond operator would identify the maximal set of products and shops that satisfy those constraints.

We determine the complexity of computing diamonds and investigate how pre-computed materialised views can speed execution. Views are defined by the parameter k associated with each dimension in every data cube. We prove that there is only one k1, k2, ..., kd-diamond for a given cube. By varying the ki we get a collection of diamonds for a cube, and these diamonds form a lattice.

We also determine the complexity of discovering the most-constrained diamond that has a non-empty solution. By executing binary search over theoretically derived bounds, we compute such a diamond efficiently.

Diamonds are defined over data cubes where the measures all have the same sign. We prove that the more general case, computing a diamond where measures include both positive and negative values, is NP-hard.

Finding dense subcubes in large data is a difficult problem. We investigate the role that diamonds play in the solution of three NP-hard problems that seek dense subcubes: Largest Perfect Cube, Densest Cube with Limited Dimensions and Heaviest Cube with Limited Dimensions.

We are interested in processing large data sets. We validated our algorithms on a variety of freely available and synthetically generated data cubes, whose dimensionalities range from three to twenty-seven. Most of the cubes contain more than a million facts, and the largest has more than 986 million facts. We show that our custom implementation is more than twenty-five times faster, on a large data set, than popular database engines.
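
The abstract does not spell out the diamond condition or the algorithm, so the following is only a minimal sketch under stated assumptions: a cube is a dictionary from attribute-value tuples to non-negative measures, and an attribute value in dimension i is kept only while the sum of the measures in its slice is at least ki. The names `diamond` and `most_constrained_k` are hypothetical, and the binary search uses the total measure as a crude upper bound rather than the theoretically derived bounds of the thesis.

```python
from collections import defaultdict


def diamond(cells, k):
    """Prune a cube until every remaining slice meets its threshold.

    cells: dict mapping d-tuples of attribute values to non-negative measures
           (an assumed representation, not taken from the thesis).
    k:     sequence of d thresholds, one per dimension.
    Returns a dict holding the surviving cells (possibly empty).
    """
    cells = dict(cells)
    d = len(k)
    while True:
        # Sum the measure over each slice (one attribute value in one dimension).
        sums = [defaultdict(int) for _ in range(d)]
        for key, measure in cells.items():
            for i in range(d):
                sums[i][key[i]] += measure
        # Keep only cells whose slices all meet the threshold of their dimension.
        kept = {
            key: measure
            for key, measure in cells.items()
            if all(sums[i][key[i]] >= k[i] for i in range(d))
        }
        if len(kept) == len(cells):  # fixed point: nothing was pruned
            return kept
        cells = kept


def most_constrained_k(cells, d):
    """Largest integer k, applied to all d dimensions, with a non-empty diamond.

    Assumes integer, non-negative measures, so raising k can only shrink
    the diamond and binary search is valid.
    """
    lo, hi = 0, sum(cells.values())  # crude bound: no slice exceeds the total
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if diamond(cells, [mid] * d):
            lo = mid
        else:
            hi = mid - 1
    return lo


# Toy cube: (shop, product) -> profit. The data are invented for illustration.
profits = {
    ("s1", "p1"): 5, ("s1", "p2"): 4, ("s1", "p3"): 1,
    ("s2", "p1"): 3, ("s2", "p2"): 6,
    ("s3", "p1"): 1,
}
result = diamond(profits, [7, 7])   # shops s1, s2 and products p1, p2 survive
best_k = most_constrained_k(profits, 2)  # 8 for this toy cube
```

In the toy cube, shop s3 and product p3 fall below the threshold in the first pass. The remaining shops and products still satisfy both constraints, so pruning stops after the second pass confirms a fixed point.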

Bibliographic details

  • Author

    Webb, Hazel Jane.

  • Author affiliation

    University of New Brunswick (Canada).

  • Degree-granting institution University of New Brunswick (Canada).
  • Subject Computer Science.
  • Degree Ph.D.
  • Year 2010
  • Pages 140 p.
  • Total pages 140
  • Original format PDF
  • Language of text eng