Journal: IEEE Transactions on Information Technology in Biomedicine

DataFoundry: information management for scientific data



Abstract

Data warehouses and data marts have been successfully applied to a multitude of commercial business applications. They have proven to be invaluable tools by integrating information from distributed, heterogeneous sources and summarizing this data for use throughout the enterprise. Although the need for information dissemination is as vital in science as in business, working warehouses in this community are scarce because traditional warehousing techniques do not transfer to scientific environments. There are two primary reasons for this difficulty. First, schema integration is more difficult for scientific databases than for business sources because of the complexity of the concepts and the associated relationships. Second, scientific data sources have highly dynamic data representations (schemata). When a data source participating in a warehouse changes its schema, both the mediator transferring data to the warehouse and the warehouse itself need to be updated to reflect these modifications. The cost of repeatedly performing these updates in a traditional warehouse, as is required in a dynamic environment, is prohibitive. The paper discusses these issues within the context of the DataFoundry project, an ongoing research effort at Lawrence Livermore National Laboratory. DataFoundry utilizes a unique integration strategy to identify corresponding instances while maintaining differences between data from different sources, and a novel architecture and an extensive meta-data infrastructure, which reduce the cost of maintaining a warehouse.
