A Dynamic Data Warehousing Platform for Creating and Accessing Biomedical Data Lakes

Abstract

Medical research use cases are population-centric, unlike clinical use cases, which are patient- or individual-centric. Research use cases therefore require access to heterogeneous medical archives and data source repositories. Traditionally, to query these data sources, users manually access and download part or all of each source. Existing solutions tend to focus on a specific data format or storage system, which prevents their use in more generic research scenarios with heterogeneous data sources, where the user may not know the data schema a priori. In this paper, we propose and discuss the design, implementation, and evaluation of Data Cafe, a scalable distributed architecture that aims to address the shortcomings of existing approaches. Data Cafe lets resource providers create biomedical data lakes from various data sources, and lets research data users consume those data lakes efficiently and quickly without a priori knowledge of the data schema.
