A flexible data mining architecture for monitoring data streams.

Abstract

Data streams are ubiquitous: performance measurements in business process management, faults and alarms in network traffic management, transactions in retail chains, ATM operations in banks, log records generated by web servers, and sensor network data are some specific examples. In almost all of these applications, the data volume is massive, up to several terabytes, and it grows further with the rapid arrival of new tuples. Traditional DBMSs are ill-equipped to process data streams in real time and do not provide adequate support for continuous queries posed over these streams.

This dissertation outlines models and issues in designing an efficient Data Stream Management System (DSMS) called Stardust. The system can handle a diverse set of continuous queries that fit naturally into the mold of data stream applications. We developed wavelet-based approximation schemes that maintain multiple levels of information over streams of data in order to answer queries efficiently.

In centralized DSMS models, a stream is summarized at a central site, and all user queries are processed at this site. In data- and query-intensive environments, the central site can become a bottleneck. As a remedy, we developed adaptive replication algorithms that disseminate stream summaries computed at a central site to interested clients. We tested the distributed version of the system on a number of testbeds. In the first scenario, Stardust exploits the scalability and load balancing of communication provided by content-based routing schemes for efficient distributed stream processing. In the second scenario, we integrated Stardust into a real-time decision support system for nondestructive health monitoring using a wireless network of sensors. The system trades off accuracy for efficient processing of sensor data in order to reduce communication overhead and power consumption.

Finally, we built an event detection framework for monitoring a set of distributed network elements. The goal is to detect potentially interesting incidents specified by users in terms of a multitude of race conditions across a set of routers while maintaining a low monitoring overhead.
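As a rough illustration of what "multiple levels of information" over a stream can mean, the sketch below builds pairwise-average summaries of a window (the approximation coefficients of a Haar wavelet transform, up to scaling) and answers a range-sum query from a chosen level. It is not taken from the dissertation: the function names `haar_levels` and `approx_range_sum`, the power-of-two window, and the range-sum query are all assumptions made for illustration, and Stardust's actual schemes may differ.

```python
from typing import List


def haar_levels(window: List[float]) -> List[List[float]]:
    """Return pairwise-average summaries of `window`, finest level first.

    levels[0] is the raw window; levels[k] holds one average per block of
    2**k consecutive values. Hypothetical helper, assumes a power-of-two length.
    """
    n = len(window)
    if n == 0 or n & (n - 1):
        raise ValueError("window length must be a power of two")
    levels = [list(window)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        # Each coarser level halves the number of stored values.
        levels.append([(prev[i] + prev[i + 1]) / 2 for i in range(0, len(prev), 2)])
    return levels


def approx_range_sum(levels: List[List[float]], level: int, start: int, end: int) -> float:
    """Approximate sum(window[start:end]) using only the summaries at `level`."""
    block = 2 ** level
    total = 0.0
    for i, avg in enumerate(levels[level]):
        lo, hi = i * block, (i + 1) * block
        # Weight each block average by how many queried positions it covers.
        overlap = max(0, min(end, hi) - max(start, lo))
        total += avg * overlap
    return total
```

Querying a coarser level reads fewer values and gives a less exact answer, which is the accuracy-for-efficiency trade-off the abstract mentions. Level 0 always reproduces the exact sum.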

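Bibliographic record

  • Author: Bulut, Ahmet
  • Author affiliation: University of California, Santa Barbara
  • Degree-granting institution: University of California, Santa Barbara
  • Discipline: Computer Science
  • Degree: Ph.D.
  • Year: 2005
  • Pages: 230 p.
  • Total pages: 230
  • Original format: PDF
  • Language of text: eng
  • Chinese Library Classification: Automation technology, computer technology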