Journal: Distributed and Parallel Databases

Stream Cube: An Architecture for Multi-Dimensional Analysis of Data Streams

Abstract

Real-time surveillance systems, telecommunication systems, and other dynamic environments often generate a tremendous (potentially infinite) volume of stream data, too large to be scanned multiple times. Much of this data resides at a rather low level of abstraction, whereas most analysts are interested in relatively high-level dynamic changes (such as trends and outliers). To discover such high-level characteristics, one may need to perform on-line, multi-level, multi-dimensional analytical processing of stream data. In this paper, we propose an architecture, called stream_cube, to facilitate on-line, multi-dimensional, multi-level analysis of stream data. For fast on-line multi-dimensional analysis of stream data, three important techniques are proposed for efficient and effective computation of stream cubes. First, a tilted time frame model is proposed as a multi-resolution model to register time-related data: more recent data are registered at finer resolution, whereas more distant data are registered at coarser resolution. This design reduces the overall storage of time-related data and adapts nicely to the data analysis tasks commonly encountered in practice. Second, instead of materializing cuboids at all levels, we propose to maintain a small number of critical layers. Flexible analysis can be performed efficiently based on the concepts of an observation layer and a minimal interesting layer. Third, an efficient stream data cubing algorithm is developed which computes only the layers (cuboids) along a popular path and leaves the other cuboids for query-driven, on-line computation. Based on this design methodology, a stream data cube can be constructed and maintained incrementally with a reasonable amount of memory, computation cost, and query response time. This is verified by our substantial performance study. The stream data cube architecture facilitates on-line analytical processing of stream data. It also forms a preliminary data structure for on-line stream data mining. The impact of the design and implementation of the stream data cube in the context of stream data mining is also discussed in the paper.
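The tilted time frame idea from the abstract (recent data at fine resolution, older data at coarse resolution) can be illustrated with a minimal Python sketch. The class name `TiltedTimeFrame`, the level names, the fan-in values (4 quarters per hour, 24 hours per day, 31 days retained), and the restriction to an additive measure are all assumptions made for illustration; the abstract does not specify them, and this is not the paper's implementation.

```python
from collections import deque


class TiltedTimeFrame:
    """Hypothetical multi-resolution store for an additive measure (e.g. a count)."""

    def __init__(self, levels):
        # levels: list of (name, fan_in), finest first. fan_in is the assumed
        # number of slots at this level that make up one slot of the next level.
        self.levels = levels
        self.slots = [deque() for _ in levels]

    def add(self, value):
        """Register the aggregate for one finest-granularity time unit."""
        self._push(0, value)

    def _push(self, i, value):
        self.slots[i].append(value)
        _, fan_in = self.levels[i]
        is_coarsest = i + 1 == len(self.levels)
        if not is_coarsest and len(self.slots[i]) == fan_in:
            # One full coarser unit has elapsed: collapse the fine slots
            # into a single coarser slot and start the fine level afresh.
            merged = sum(self.slots[i])
            self.slots[i].clear()
            self._push(i + 1, merged)
        elif is_coarsest and len(self.slots[i]) > fan_in:
            # Coarsest level: keep only the most recent fan_in slots.
            self.slots[i].popleft()

    def snapshot(self):
        """Return the stored slots per level, oldest first within each level."""
        return {name: list(s) for (name, _), s in zip(self.levels, self.slots)}


# Usage: 206 quarter-hour readings, each contributing a count of 1.
ttf = TiltedTimeFrame([("quarter", 4), ("hour", 24), ("day", 31)])
for _ in range(206):
    ttf.add(1)
print(ttf.snapshot())
# {'quarter': [1, 1], 'hour': [4, 4, 4], 'day': [96, 96]}
```

In this sketch the 206 readings occupy only 7 slots, while the total of the measure (206) is preserved. That is the storage reduction the abstract attributes to the tilted time frame, at the cost of coarser resolution for older data.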
