Big Data analytics in static and streaming provenance.

Abstract

With recent technological and computational advances, scientists increasingly integrate sensors and model simulations to understand spatial, temporal, social, and ecological relationships at unprecedented scale. Data provenance traces the relationships of entities over time, thus providing a unique view of the over-time behavior under study. However, provenance can be overwhelming in both volume and complexity; the now forecasting potential of provenance creates additional demands.

This dissertation focuses on Big Data analytics of static and streaming provenance. It develops filters and a non-preprocessing slicing technique for in-situ querying of static provenance. It presents a stream processing framework for online processing of provenance data received at a high rate. While the former is sufficient for answering queries that are given prior to the application start (forward queries), the latter deals with queries whose targets are unknown beforehand (backward queries). Finally, it explores data mining on large collections of provenance and proposes a temporal representation of provenance that reduces the high dimensionality while effectively supporting mining tasks such as clustering, classification, and association rule mining; the temporal representation can also be applied to streaming provenance. The proposed techniques are verified through software prototypes applied to Big Data provenance captured from computer network data, weather models, ocean models, remote (satellite) imagery data, and agent-based simulations of agricultural decision making.

Bibliographic record

  • Author: Chen, Peng
  • Author affiliation: Indiana University
  • Degree-granting institution: Indiana University
  • Subject: Computer science
  • Degree: Ph.D.
  • Year: 2016
  • Pages: 191
  • Original format: PDF
  • Language: English
