Locating Data Sources in Large Distributed Systems

Abstract

Querying large numbers of data sources is gaining importance due to increasing numbers of independent data providers. One of the key challenges is executing queries on all relevant information sources in a scalable fashion and retrieving fresh results. The key to scalability is to send queries only to the relevant servers and avoid wasting resources on data sources which will not provide any results. Thus, a catalog service, which would determine the relevant data sources given a query, is an essential component in efficiently processing queries in a distributed environment. This paper proposes a catalog framework which is distributed across the data sources themselves and does not require any central infrastructure. As new data sources become available, they automatically become part of the catalog service infrastructure, which allows scalability to large numbers of nodes. Furthermore, we propose techniques for workload adaptability. Using simulation and real-world data we show that our approach is valid and can scale to thousands of data sources.
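The routing idea in the abstract (send a query only to sources that could return results) can be illustrated with a minimal sketch. This is not the paper's framework: the abstract does not say how sources are summarized or how the catalog is distributed, so the per-attribute min/max summary, the single in-memory `Catalog`, and all names below are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class RangeSummary:
    """Hypothetical summary a data source advertises: min/max of one attribute."""
    low: float
    high: float

    def overlaps(self, low: float, high: float) -> bool:
        # Two closed intervals overlap when each starts before the other ends.
        return self.low <= high and low <= self.high


@dataclass
class Catalog:
    """Toy in-memory catalog (assumption: the paper's catalog is distributed)."""
    summaries: dict = field(default_factory=dict)

    def register(self, source: str, attribute: str, low: float, high: float) -> None:
        # A new source joins the catalog by publishing its summary.
        self.summaries.setdefault(source, {})[attribute] = RangeSummary(low, high)

    def relevant_sources(self, attribute: str, low: float, high: float) -> list:
        # Keep only sources whose advertised range can intersect the query range.
        return sorted(
            source
            for source, attrs in self.summaries.items()
            if attribute in attrs and attrs[attribute].overlaps(low, high)
        )


catalog = Catalog()
catalog.register("source_a", "temperature", 0.0, 15.0)
catalog.register("source_b", "temperature", 20.0, 35.0)
catalog.register("source_c", "humidity", 10.0, 90.0)

# A range query on temperature is routed only to sources that might answer it.
targets = catalog.relevant_sources("temperature", 10.0, 22.0)
print(targets)  # ['source_a', 'source_b']
```

In this sketch `source_c` is never contacted for the temperature query, which is the resource saving the abstract describes. A real deployment would also need to keep summaries fresh as the underlying data changes.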
