PolarHub: A large-scale web crawling engine for OGC service discovery in cyberinfrastructure

Journal: Computers, Environment and Urban Systems


Abstract

The advancement of geospatial interoperability research has fostered the proliferation of geospatial resources that are shared and made publicly available on the Web. However, their increasing availability has made identifying the web signature of voluminous geospatial resources a major challenge. In this paper, we introduce our solution, a new cyberinfrastructure platform called PolarHub, which conducts large-scale web crawling to discover distributed geospatial data and service resources efficiently and effectively. PolarHub is built upon a service-oriented architecture (SOA) and adopts a Data Access Object (DAO)-based software design pattern to ensure the extensibility of the software system. The proposed meta-search-based seed selection and pattern-matching-based crawling strategies facilitate rapid resource identification and discovery by constraining the search scope on the Web. In addition, PolarHub introduces an advanced asynchronous communication strategy that combines client-pull and server-push to ensure high efficiency of the crawling system. These unique design features make PolarHub a high-performance, scalable, sustainable, collaborative, and interactive platform for active geospatial data discovery. Because standards from the Open Geospatial Consortium (OGC) are widely adopted, OGC-compliant web services are the primary search target of PolarHub. Currently, the PolarHub system is up and running and serves various scientific communities that demand geospatial data. We consider PolarHub a significant contribution to the fields of information retrieval and geospatial interoperability. (C) 2016 Elsevier Ltd. All rights reserved.
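
The abstract does not state which patterns PolarHub matches, so the following Python sketch is only an illustration of what pattern-matching-based filtering of crawled URLs could look like. It assumes that a candidate OGC endpoint can be recognized by the key-value parameters `service` and `request=GetCapabilities` in its query string, which is how OGC web services conventionally advertise their capabilities. The function names, the list of service types, and the example URLs are hypothetical and are not taken from the paper.

```python
from urllib.parse import urlsplit, parse_qs

# Assumed set of OGC service types to look for; the abstract does not list them.
OGC_SERVICE_TYPES = {"WMS", "WFS", "WCS", "WMTS", "CSW", "SOS", "WPS"}


def detect_ogc_service(url):
    """Return the OGC service type hinted at by a URL's query string, or None."""
    query = parse_qs(urlsplit(url).query)
    # OGC key-value parameter names are case-insensitive, so normalize them.
    params = {key.lower(): values[0] for key, values in query.items()}
    service = params.get("service", "").upper()
    request = params.get("request", "").lower()
    if service in OGC_SERVICE_TYPES and request == "getcapabilities":
        return service
    return None


def filter_candidates(urls):
    """Keep only URLs that look like OGC GetCapabilities endpoints."""
    found = {}
    for url in urls:
        service = detect_ogc_service(url)
        if service is not None:
            found[url] = service
    return found
```

A crawler could run `filter_candidates` over the links extracted from each fetched page and follow only the matches, which is one way of constraining the search scope. A production system would likely also need to inspect the returned capabilities document, since a matching URL alone does not prove that a live service exists behind it.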
