Journal: International Journal of Cooperative Information Systems

A Distributed Infrastructure for Earth-Science Big Data Retrieval


Abstract

Earth-Science data are composite, multi-dimensional and of significant size, and as such continue to pose problems for their management. As new and diverse information sources emerge and the rate of data generation keeps increasing, a persistent challenge becomes more pressing: making the information held in multiple heterogeneous resources readily available. The widespread use of the XML data-exchange format has enabled the rapid accumulation of semi-structured metadata for Earth-Science data. In this paper, we exploit this popular use of XML and present the means for querying metadata emanating from multiple sources in a succinct and effective way. We thereby release the user from the tedious and time-consuming task of examining individual XML descriptions one by one. Our approach, termed Meta-Array Data Search (MAD Search), brings together diverse data sources while enhancing the user-friendliness of the underlying information sources. We gather metadata that follow different standards and construct an amalgamated service with the help of tools that discover and harvest such metadata; this service offers the end-user easy and timely access to all metadata. The main contribution of our work is a novel query language termed xWCPS, which builds on two widely adopted standards: XQuery and the Web Coverage Processing Service (WCPS). xWCPS furnishes a rich set of features for querying scientific data. Our proposed unified language allows for requesting metadata while also giving processing directives. Consequently, the xWCPS-enabled MAD Search helps in both retrieval and processing of large data sets hosted in a heterogeneous infrastructure. We demonstrate the effectiveness of our approach through diverse use-cases that provide insights into the syntactic power and overall expressiveness of xWCPS. We evaluate MAD Search in a distributed environment that comprises five high-volume array databases whose sizes range between 20 and 100 GB, and thereby ascertain the applicability and potential of our proposal.
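
The two-step idea in the abstract (select data sets by their XML metadata across several sources, then attach a processing directive to each match) can be illustrated with a minimal Python sketch. This is not xWCPS syntax and not the authors' implementation: the source names, XML element names and coverage ids are invented for the example, and the request string is only loosely modelled on a WCPS `for ... return encode(...)` expression. Nothing is sent to a server.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata, one XML document per source. Element names and
# coverage ids are assumptions made for this illustration only.
SOURCES = {
    "archive_a": (
        "<coverages>"
        "<coverage id='sst_2010'><theme>ocean</theme></coverage>"
        "<coverage id='ndvi_2010'><theme>land</theme></coverage>"
        "</coverages>"
    ),
    "archive_b": (
        "<coverages>"
        "<coverage id='chl_2011'><theme>ocean</theme></coverage>"
        "</coverages>"
    ),
}


def find_coverages(sources, theme):
    """Return (source, coverage id) pairs whose <theme> equals `theme`."""
    hits = []
    for name, xml_text in sorted(sources.items()):
        root = ET.fromstring(xml_text)
        for cov in root.findall("coverage"):
            if cov.findtext("theme") == theme:
                hits.append((name, cov.get("id")))
    return hits


def processing_request(coverage_id):
    """Build a WCPS-style request string for one coverage (illustrative)."""
    return f'for c in ({coverage_id}) return encode(c, "png")'


# Step 1: metadata filter across all sources. Step 2: one request per match.
matches = find_coverages(SOURCES, "ocean")
requests = [processing_request(cid) for _, cid in matches]

for (source, _), req in zip(matches, requests):
    print(source, "->", req)
```

A unified language such as the one the abstract describes would express both steps in a single query; the sketch keeps them as two functions only to make the separation between metadata retrieval and processing visible.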
