Conference: IEEE International Conference on E-Science Workshops

Logical Optimization of Dataflows for Data Mining and Integration Processes

Abstract

Modern scientific collaborations require large-scale data mining and integration processes. Their investigations involve multi-disciplinary expertise and large-scale computational experiments over large amounts of data held in distributed repositories that run various software systems and are managed by different organizations. Higher-level dataflow languages are used on top of parallel dataflow systems to enable faster program development and more maintainable code. Logical and physical optimization should be applied to a dataflow before it is executed to improve performance. In this paper we present the rationale, theory, design and application of logical optimization of dataflows for data mining and integration processes. A dataflow model is defined, and several optimization algorithms are developed: dead-element elimination, process re-ordering, parallelization, and data by-passing. The first research prototype of the framework has been implemented in the context of the ADMIRE Data Mining and Integration Process Designer, for logical optimization of specifications expressed in the DISPEL language developed in the ADMIRE project.
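
To make the first of the listed optimizations concrete, the following is a minimal sketch of what dead-element elimination on a dataflow graph could look like. It is not the algorithm from the paper and does not use DISPEL: the graph is assumed to be a plain list of (producer, consumer) connections, and the function name `eliminate_dead_elements` and the element names are hypothetical, chosen only for illustration. An element is treated as dead when none of its output can reach a sink.

```python
from collections import deque


def eliminate_dead_elements(edges, sinks):
    """Return the connections of a dataflow that can still reach a sink.

    edges: iterable of (producer, consumer) pairs between processing elements.
    sinks: elements whose output is actually delivered (assumed known).
    """
    edges = list(edges)

    # Index producers by consumer so the graph can be walked backwards.
    producers = {}
    for src, dst in edges:
        producers.setdefault(dst, set()).add(src)

    # Breadth-first search from the sinks marks every live element.
    live = set(sinks)
    queue = deque(sinks)
    while queue:
        node = queue.popleft()
        for src in producers.get(node, ()):
            if src not in live:
                live.add(src)
                queue.append(src)

    # Keep only connections whose two ends are both live.
    return [(s, d) for s, d in edges if s in live and d in live]


# Hypothetical dataflow: the histogram branch is never consumed.
flow = [
    ("read", "filter"),
    ("filter", "classify"),
    ("classify", "deliver"),
    ("filter", "histogram"),
]
optimized = eliminate_dead_elements(flow, sinks=["deliver"])
```

In this sketch the `histogram` element is dropped because nothing downstream of it reaches `deliver`, while the shared `filter` element is kept because the live branch still depends on it.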
