
An adaptable repository for complex scientific metadata.


Abstract

The explosive growth in computational science has led a broad spectrum of scientific communities to recognize the need to capture and preserve both the deluge of data being generated and the metadata that describe them. Metadata is recognized as essential, which has led to detailed metadata specifications defined as XML schemata. However, although the volume of data keeps growing, the metadata that describes it has not kept pace. In part this is due to a misalignment of incentives between when metadata is generated and when it has value. Metadata is ephemeral and must be captured as an experiment runs, but the value of the data (and of the metadata used to describe or use it) is often unknown, possibly for decades.

Existing middleware for cataloging metadata across scientific domains necessarily takes a generic approach and cannot communicate using domain-specific schemata without customized middleware. This dissertation presents a different approach based on the thesis that although scientific metadata schemata are domain-specific, they share commonalities that differentiate them from other schemata, from metadata in non-scientific domains, and from general XML. The key characteristics of scientific metadata schemata that we identify are their composition from unordered, independent concepts and the need to capture metadata incrementally, concept by concept. Additionally, unlike data communicated as XML, scientific discovery metadata serves as a search index for locating relevant data sets. Based on these commonalities in both the structure and the use of scientific metadata, we show that scientific metadata schemata can be partitioned into sets of unordered metadata concepts, enabling a global ordering of concepts that we exploit in a generalized framework that is a hybrid of approaches used to store XML. This hybrid approach enables both detailed search queries over the metadata and efficient reconstruction of XML in response to queries.

This approach is validated through the XMC Cat metadata catalog, which uses a lightweight SOA-based architecture and can be deployed for varied scientific schemata through configuration instead of customized middleware. We present a prototype of the XMC Cat Builder, which guides the user through generating the necessary configuration from a domain XML schema using a point-and-click, web-based interface.
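The hybrid storage idea described in the abstract can be illustrated with a minimal Python sketch. This is not the XMC Cat implementation, whose design the abstract does not detail. It assumes that each top-level child of a metadata document is one independent "concept", that every concept is stored twice (once as an intact XML fragment for fast reconstruction, once as shredded leaf values for search), and that a fixed global ordering of concepts is known in advance. The names `HybridCatalog`, `CONCEPT_ORDER`, `add_concept`, `search`, and `reconstruct`, as well as the concept names themselves, are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical global ordering of concepts; a real deployment would derive
# this from the domain XML schema rather than hard-code it.
CONCEPT_ORDER = ["identification", "spatial", "temporal"]


class HybridCatalog:
    """Toy in-memory catalog: XML fragments plus shredded leaf values."""

    def __init__(self):
        self.fragments = {}  # object_id -> {concept name: serialized XML}
        self.index = []      # rows of (object_id, concept, leaf tag, value)

    def add_concept(self, object_id, fragment_xml):
        """Add or replace one concept; concepts may arrive in any order."""
        elem = ET.fromstring(fragment_xml)
        concept = elem.tag
        if concept not in CONCEPT_ORDER:
            raise ValueError(f"unknown concept: {concept}")
        # Keep the fragment intact so it never has to be rebuilt from rows.
        serialized = ET.tostring(elem, encoding="unicode")
        self.fragments.setdefault(object_id, {})[concept] = serialized
        # Drop index rows left over from an earlier version of this concept.
        self.index = [r for r in self.index if r[:2] != (object_id, concept)]
        # Shred leaf elements into searchable rows.
        for node in elem.iter():
            text = (node.text or "").strip()
            if len(node) == 0 and text:
                self.index.append((object_id, concept, node.tag, text))

    def search(self, concept, name, value):
        """Return ids of objects whose concept has a leaf `name` == `value`."""
        return sorted({r[0] for r in self.index
                       if r[1:] == (concept, name, value)})

    def reconstruct(self, object_id, root_tag="metadata"):
        """Rebuild a document by concatenating fragments in global order."""
        stored = self.fragments.get(object_id, {})
        body = "".join(stored[c] for c in CONCEPT_ORDER if c in stored)
        return f"<{root_tag}>{body}</{root_tag}>"


catalog = HybridCatalog()
# Concepts are captured incrementally and out of order.
catalog.add_concept("run-1", "<temporal><start>2009</start></temporal>")
catalog.add_concept("run-1",
                    "<identification><title>storm</title></identification>")
print(catalog.search("temporal", "start", "2009"))
print(catalog.reconstruct("run-1"))
```

In this sketch, search touches only the shredded rows, while reconstruction only concatenates stored fragments, which is one plausible reading of why a hybrid of XML storage approaches would support both operations efficiently.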

Bibliographic record

  • Author: Jensen, Scott
  • Author affiliation: Indiana University
  • Degree-granting institution: Indiana University
  • Subject: Computer Science
  • Degree: Ph.D.
  • Year: 2010
  • Pages: 206 p.
  • Total pages: 206
  • Original format: PDF
  • Language of text: English
