International Conference on Very Large Data Bases; VLDB 2008

Fault-tolerant Stream Processing using a Distributed, Replicated File System

Abstract

We present SGuard, a new fault-tolerance technique for distributed stream processing engines (SPEs) running in clusters of commodity servers. SGuard is less disruptive to normal stream processing and leaves more resources available for it than previous proposals. Like several previous schemes, SGuard is based on rollback recovery [18]: it periodically checkpoints the state of stream processing nodes and restarts failed nodes from their most recent checkpoints. In contrast to previous proposals, however, SGuard performs checkpoints asynchronously: operators continue processing streams during the checkpoint, which reduces the potential disruption caused by checkpointing. Additionally, SGuard saves the checkpointed state into a new type of distributed and replicated file system (DFS), such as GFS [22] or HDFS [9], leaving more memory resources available for normal stream processing. To manage resource contention due to simultaneous checkpoints by different SPE nodes, SGuard adds a scheduler to the DFS. This scheduler coordinates large batches of write requests in a manner that reduces individual checkpoint times while maintaining good overall resource utilization. We demonstrate the effectiveness of the approach through measurements of a prototype implementation in the Borealis [2] open-source SPE, using HDFS [9] as the DFS.
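
As a rough illustration of the asynchronous rollback-recovery idea described in the abstract, the following minimal Python sketch may help. It is not SGuard's implementation. The class CountingOperator, its methods, and the checkpoint file naming are all hypothetical. A deep copy of the state stands in for whatever snapshot mechanism SGuard actually uses, and a local temporary directory stands in for the replicated DFS.

```python
import copy
import json
import os
import tempfile
import threading


class CountingOperator:
    """Toy stateful operator that counts tuples per key (hypothetical, not Borealis code)."""

    def __init__(self, checkpoint_dir):
        self.checkpoint_dir = checkpoint_dir  # assumption: local dir stands in for the DFS
        self.state = {}
        self.epoch = 0
        self.lock = threading.Lock()

    def process(self, key):
        """Normal stream processing: update the operator state for one tuple."""
        with self.lock:
            self.state[key] = self.state.get(key, 0) + 1

    def checkpoint_async(self):
        """Copy the state under the lock, then write the copy in a background thread."""
        with self.lock:
            snapshot = copy.deepcopy(self.state)  # short pause: in-memory copy only
            self.epoch += 1
            epoch = self.epoch
        writer = threading.Thread(target=self._write, args=(snapshot, epoch))
        writer.start()  # the slow write overlaps with further process() calls
        return writer

    def _write(self, snapshot, epoch):
        final = os.path.join(self.checkpoint_dir, f"ckpt-{epoch:06d}.json")
        tmp = final + ".tmp"
        with open(tmp, "w") as f:
            json.dump(snapshot, f)
        os.replace(tmp, final)  # rename so recovery never reads a partial checkpoint

    @classmethod
    def recover(cls, checkpoint_dir):
        """Restart from the most recent completed checkpoint, if any exists."""
        op = cls(checkpoint_dir)
        done = sorted(n for n in os.listdir(checkpoint_dir) if n.endswith(".json"))
        if done:
            with open(os.path.join(checkpoint_dir, done[-1])) as f:
                op.state = json.load(f)
            op.epoch = int(done[-1][5:11])  # digits of "ckpt-NNNNNN.json"
        return op


def demo():
    with tempfile.TemporaryDirectory() as d:
        op = CountingOperator(d)
        for key in ["a", "b", "a"]:
            op.process(key)
        writer = op.checkpoint_async()
        op.process("a")  # processing continues while the checkpoint is written
        writer.join()
        live = dict(op.state)
        restored = CountingOperator.recover(d).state
        return live, restored
```

In this sketch, demo() returns the live state and the state restored from the checkpoint. The tuple processed after the snapshot is missing from the restored state, so a rollback-recovery scheme would typically also need to replay the input received since the last checkpoint.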
