Conference proceedings: Provenance and Annotation of Data; Lecture Notes in Computer Science; 4145

Mapping Physical Formats to Logical Models to Extract Data and Metadata: The Defuddle Parsing Engine


Abstract

Scientists, motivated by the desire for systems-level understanding of phenomena, increasingly need to share their results across multiple disciplines. Accomplishing this requires data to be annotated, contextualized, readily searchable, and translatable into other formats. While these requirements can be addressed by custom programming or obviated by community standardization, neither approach has 'solved' the problem. In this paper, we describe a complementary approach: a general capability for articulating the format of arbitrary textual and binary data using a logical data model, expressed in XML Schema, which can be used to provide annotation and context, extract metadata, and enable translation. This work is based on the draft specification for the Data Format Description Language (DFDL) and our open source "Defuddle" parser. We present an overview of the specification, detail the design of Defuddle, and discuss the benefits and challenges of this general approach to enabling discovery, sharing, and interpretation of diverse data sets.
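As a rough illustration of the idea in the abstract (describing a physical format declaratively and mapping it onto a logical model), the following minimal Python sketch may help. It is not DFDL syntax and not Defuddle's API: the `RECORD_FORMAT` table, the field names, and the `parse_record` helper are all hypothetical, and the real system expresses the logical model in XML Schema rather than in a Python list.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical declarative description of one binary record:
# (logical field name, struct type code). This only stands in for a
# format description; it is NOT DFDL and NOT part of Defuddle.
RECORD_FORMAT = [
    ("station_id", "H"),    # unsigned 16-bit integer
    ("temperature", "f"),   # 32-bit IEEE float
    ("sample_count", "I"),  # unsigned 32-bit integer
]


def parse_record(data: bytes, fmt=RECORD_FORMAT, byte_order="<") -> ET.Element:
    """Map raw bytes onto a logical XML element using the field description."""
    # Build the physical layout string from the declarative description.
    layout = byte_order + "".join(code for _, code in fmt)
    values = struct.unpack(layout, data[:struct.calcsize(layout)])

    # Emit one child element per described field (the "logical model").
    record = ET.Element("record")
    for (name, _), value in zip(fmt, values):
        ET.SubElement(record, name).text = str(value)
    return record


# Example: pack a sample record, then recover it as XML.
raw = struct.pack("<HfI", 42, 21.5, 1000)
xml_text = ET.tostring(parse_record(raw), encoding="unicode")
print(xml_text)
```

The point of the sketch is only that the parser itself is generic: changing the description changes what is extracted, without new parsing code. How Defuddle actually does this is detailed in the paper, not here.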
