Journal: BMC Bioinformatics

Scenario driven data modelling: a method for integrating diverse sources of data and data streams


Abstract

Background

Biology is rapidly becoming a data-intensive, data-driven science. It is essential that data are represented and connected in ways that best capture their full conceptual content and that allow both automated integration and data-driven decision-making. Recent advancements in distributed multi-relational directed graphs, implemented in the form of the Semantic Web, make it possible to deal with complicated heterogeneous data in new and interesting ways.

Results

This paper presents a new approach, scenario driven data modelling (SDDM), that integrates multi-relational directed graphs with data streams. SDDM can be applied to virtually any data integration challenge with widely divergent types of data and data streams. In this work, we explored integrating genetics data with reports from traditional media. SDDM was applied to the New Delhi metallo-beta-lactamase gene (NDM-1), an emerging global health threat. The SDDM process constructed a scenario, created an RDF multi-relational directed graph that linked diverse types of data to the Semantic Web, implemented RDF conversion tools (RDFizers) to bring content into the Semantic Web, identified data streams and analytical routines to analyse those streams, and identified user requirements and graph traversals to meet end-user requirements.

Conclusions

We provided an example where SDDM was applied to a complex data integration challenge. The process created a model of the emerging NDM-1 health threat, identified and filled gaps in that model, and constructed reliable software that monitored data streams based on the scenario-derived multi-relational directed graph. The SDDM process significantly reduced the software requirements phase by letting the scenario and the resulting multi-relational directed graph define what is possible and then set the scope of the user requirements. Approaches like SDDM will be critical to the future of data-intensive, data-driven science because they automate the process of converting massive data streams into usable knowledge.
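As a rough illustration of the ideas named in the abstract, the sketch below models a multi-relational directed graph as subject-predicate-object triples, converts flat media records into triples with a toy "RDFizer", and answers one end-user question with a graph traversal. It is not the authors' software: the predicate names, the record layout, the report identifiers, the place names, and the helper functions `rdfize`, `build_graph`, and `reports_linked_to` are all assumptions made for this example, and it uses plain Python data structures rather than a real RDF library.

```python
from collections import defaultdict

# Hypothetical background triples (subject, predicate, object).
# Predicate names and identifiers are invented for illustration.
BACKGROUND = [
    ("gene:NDM-1", "confersResistanceTo", "drugclass:carbapenems"),
    ("gene:NDM-1", "foundIn", "organism:Klebsiella_pneumoniae"),
]

# Hypothetical media reports, standing in for items from a data stream.
RECORDS = [
    {"id": "001", "place": "CityA", "mentions": ["gene:NDM-1"]},
    {"id": "002", "place": "CityB", "mentions": ["organism:Klebsiella_pneumoniae"]},
    {"id": "003", "place": "CityC", "mentions": ["gene:other"]},
]


def rdfize(record):
    """Toy 'RDFizer': turn one flat record into a list of triples."""
    subject = f"report:{record['id']}"
    triples = [(subject, "publishedIn", f"place:{record['place']}")]
    triples += [(subject, "mentions", term) for term in record["mentions"]]
    return triples


def build_graph(triples):
    """Index triples as subject -> list of (predicate, object) edges."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return graph


def reports_linked_to(graph, target):
    """Traversal: reports mentioning the target or one of its direct neighbours."""
    related = {target} | {obj for _, obj in graph.get(target, [])}
    return sorted(
        subject
        for subject, edges in graph.items()
        if subject.startswith("report:")
        and any(p == "mentions" and o in related for p, o in edges)
    )


all_triples = BACKGROUND + [t for record in RECORDS for t in rdfize(record)]
graph = build_graph(all_triples)
linked = reports_linked_to(graph, "gene:NDM-1")
print(linked)
```

In this sketch the second report never names the gene, yet the traversal still surfaces it because the graph links the gene to the organism the report mentions. That kind of indirect connection is what a multi-relational graph makes easy to query.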
