Journal: Data & Knowledge Engineering

Extending the data warehouse for service provisioning data


Abstract

Over the last few years, an extensive body of literature on data warehousing applications has focused primarily on basket-type (transactional) data, which is common in retail industries. In this paper we focus on service provisioning data, that is, data recorded internally in an organization for provisioning certain business-related tasks. Coupling the recorded data with the underlying process and business practice(s) that generate it is crucial for end-to-end analysis. Our framework is based on a graph description (called a sketch) of the process that generates this data. Using this sketch, we formalize a new class of aggregate queries that consolidate data from a part of the process, based on a user-defined path expression. We then show how to build a compact, non-redundant collection of summary (aggregate) tables and indices for this new type of query. We first explore how to select a minimum set of views to answer queries with path expressions over the given sketch. For queries that also include aggregation, we define two partial orders among the views. The first is used to pick the minimum set of aggregate views that answers any query with no false dismissals, while the second describes an augmented set that allows fewer false positives. A non-materialized aggregate is computed through appropriate rewriting of the user query. We describe two indexing schemes that use phantom (non-materialized) aggregate values to expedite query processing. Experimental results show that these schemes perform well on synthetic and real datasets.
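To make the idea of an aggregate query over a path expression more concrete, here is a minimal Python sketch. It is an illustration only and is not taken from the paper: the process graph `SKETCH`, the sample `RECORDS`, the step names, and the function `aggregate_over_path` are all hypothetical, and the query is evaluated by a naive scan rather than through the materialized views or indexing schemes the abstract mentions.

```python
from collections import defaultdict

# Hypothetical process sketch: nodes are process steps, edges are allowed transitions.
SKETCH = {
    "order": ["credit_check", "provision"],
    "credit_check": ["provision"],
    "provision": ["test"],
    "test": ["activate"],
    "activate": [],
}

# Hypothetical provisioning records: (task_id, step, duration_hours),
# listed in execution order for each task.
RECORDS = [
    (1, "order", 1.0), (1, "credit_check", 4.0), (1, "provision", 8.0),
    (1, "test", 2.0), (1, "activate", 0.5),
    (2, "order", 1.5), (2, "provision", 6.0), (2, "test", 3.0),
    (2, "activate", 0.5),
]


def is_path_in_sketch(sketch, path):
    """Return True if every step exists and consecutive steps are joined by an edge."""
    if not all(step in sketch for step in path):
        return False
    return all(b in sketch[a] for a, b in zip(path, path[1:]))


def aggregate_over_path(records, sketch, path):
    """Sum the duration over the given path for every task whose trace contains it.

    Naive full scan (an assumption made for illustration): group records into
    per-task traces, then look for the path as a contiguous run of steps.
    """
    path = list(path)
    if not is_path_in_sketch(sketch, path):
        raise ValueError(f"path {path} is not a valid path in the sketch")

    traces = defaultdict(list)
    for task_id, step, duration in records:
        traces[task_id].append((step, duration))

    totals = {}
    n = len(path)
    for task_id, trace in traces.items():
        steps = [step for step, _ in trace]
        for i in range(len(steps) - n + 1):
            if steps[i:i + n] == path:
                totals[task_id] = sum(d for _, d in trace[i:i + n])
                break
    return totals


if __name__ == "__main__":
    print(aggregate_over_path(RECORDS, SKETCH, ["provision", "test"]))
    print(aggregate_over_path(RECORDS, SKETCH, ["order", "provision", "test"]))
```

In this toy setting, the path `provision -> test` matches both tasks, while `order -> provision -> test` matches only the task that skipped the credit check. A real system along the lines of the abstract would answer such queries from precomputed aggregate views instead of rescanning the raw records.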