Journal: Software
EDISON-DATA: A flexible and extensible platform for processing and analysis of computational science data

Abstract

With the recent emergence of new paradigms, i.e., open science and big data, the need for data sharing and collaboration is becoming important in the computational science field as well. The EDISON-DATA platform aims to provide services through which computational simulation data can easily be published, preserved, shared, reused, discovered, and analyzed. This paper first analyzes the issues, encountered during the development of the EDISON-DATA platform, that computational science platforms face in sharing and reusing computational science data. These issues include data complexity, diversity, reliability, and heterogeneity. To address them and to support data analysis in an efficient and integrated manner, the paper presents the ideas used in the EDISON-DATA platform. First, we propose an automated preprocessing framework to handle the complexity of computational science data. Second, to address the diversity issue, we present ways to develop preprocessing logic and data presentation logic customized for each data type. Third, to improve the reliability of computational science data, we present quality control and provenance management techniques. Fourth, we propose a way to manage related data in groups. Fifth, to address the data heterogeneity problem and to analyze data in an integrated way, we have the preprocessing framework use controlled vocabularies to express descriptive metadata. Lastly, we demonstrate the feasibility and usability of the proposed ideas through a case study of building a research portal service in the materials field on top of the EDISON-DATA platform.
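
The second and fifth ideas above (preprocessing logic customized per data type, and descriptive metadata expressed in a controlled vocabulary) can be illustrated with a minimal sketch. This is not the EDISON-DATA implementation: the registry, the `keyvalue` data type, the vocabulary entries, and every function name below are hypothetical and chosen only to show the general pattern.

```python
from typing import Callable, Dict

# Hypothetical controlled vocabulary: maps tool-specific metadata keys
# to one shared term. The entries are invented for illustration.
CONTROLLED_VOCAB: Dict[str, str] = {
    "etot": "total_energy",
    "E_total": "total_energy",
    "natoms": "atom_count",
    "n_atoms": "atom_count",
}

# Registry of preprocessing functions, one per data type (assumed design).
PREPROCESSORS: Dict[str, Callable[[str], Dict[str, str]]] = {}


def register(data_type: str):
    """Decorator that registers a preprocessing function for a data type."""
    def decorator(func: Callable[[str], Dict[str, str]]):
        PREPROCESSORS[data_type] = func
        return func
    return decorator


@register("keyvalue")
def parse_keyvalue(text: str) -> Dict[str, str]:
    """Parse 'key = value' lines; a stand-in for a real output-file parser."""
    raw: Dict[str, str] = {}
    for line in text.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            raw[key.strip()] = value.strip()
    return raw


def normalize(raw: Dict[str, str]) -> Dict[str, str]:
    """Rename known keys to controlled terms; unknown keys pass through."""
    return {CONTROLLED_VOCAB.get(key, key): value for key, value in raw.items()}


def preprocess(data_type: str, text: str) -> Dict[str, str]:
    """Dispatch to the type-specific parser, then normalize the metadata."""
    if data_type not in PREPROCESSORS:
        raise ValueError(f"no preprocessor registered for {data_type!r}")
    return normalize(PREPROCESSORS[data_type](text))


if __name__ == "__main__":
    # Two outputs with different key names describe the same quantities.
    first = preprocess("keyvalue", "etot = -12.5\nnatoms = 4")
    second = preprocess("keyvalue", "E_total = -12.5\nn_atoms = 4")
    print(first == second)
```

In this sketch, supporting a new data type means registering one more function, and records produced by different tools become comparable because their metadata keys are reduced to the same controlled terms.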