Conference: IEEE International Conference on e-Science

A comprehensive scenario agnostic Data LifeCycle model for an efficient data complexity management

Abstract

A vast amount of data is generated every day around the world, coming from a variety of sources and differing in format, quality level, and so on. This new data, together with archived historical data, constitutes the seed for future knowledge discovery and value generation in several fields of eScience. Discovering value from data is a complex computing process in which data is the key resource, not only during processing but throughout its entire life cycle. However, there is still considerable concern about how to organize and manage this data, in all fields and at all scales, so that it can be used and exploited efficiently across the whole data life cycle. Although several specific Data LifeCycle (DLC) models have recently been defined for particular scenarios, we argue that there is no global, comprehensive DLC framework that can be widely used across different fields. For this reason, in this paper we present and describe a comprehensive scenario agnostic Data LifeCycle (COSA-DLC) model that addresses all the challenges included in the 6Vs, namely Value, Volume, Variety, Velocity, Variability and Veracity. The model is not tailored to any specific environment, but can easily be adapted to fit the requirements of any particular field. We conclude that a comprehensive scenario agnostic DLC model provides several advantages, such as facilitating global data organization and integration, easing adaptation to any kind of scenario, guaranteeing good data quality levels, and helping the research and industrial communities save design time and effort.
