Published in: Computer Architecture News

DCatch: Automatically Detecting Distributed Concurrency Bugs in Cloud Systems



Abstract

In the era of big data and cloud computing, the reliability of distributed systems is extremely important. Unfortunately, distributed concurrency bugs, referred to as DCbugs, are widespread. They hide in the large state space of distributed cloud systems and manifest non-deterministically, depending on the timing of distributed computation and communication. Effective techniques for detecting DCbugs are needed. This paper presents DCatch, a pilot solution for DCbug detection. DCatch predicts DCbugs by analyzing correct executions of distributed systems. To build DCatch, we design a set of happens-before rules that model a wide variety of communication and concurrency mechanisms in real-world distributed cloud systems. We then build runtime tracing and trace analysis tools to effectively identify concurrent conflicting memory accesses in these systems. Finally, we design tools to help prune false positives and trigger DCbugs. We have evaluated DCatch on four representative open-source distributed cloud systems: Cassandra, Hadoop MapReduce, HBase, and ZooKeeper. By monitoring correct executions of seven workloads on these systems, DCatch reports 32 DCbugs, 20 of which are truly harmful.
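To make the idea of happens-before analysis over a trace concrete, the following is a minimal sketch and not DCatch's actual implementation. It assumes a hypothetical trace format (the `Event` record and its fields are invented here for illustration) and models only two happens-before rules: program order within a thread, and message send before the matching receive. The abstract describes a much wider rule set. Two accesses to the same variable on the same node are reported as a candidate when at least one is a write and neither happens before the other.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Event:
    """Hypothetical trace record; the field names are assumptions."""
    eid: int      # unique event id, in trace order
    node: str     # machine that executed the event
    thread: str   # thread within that machine
    kind: str     # "read", "write", "send", or "recv"
    target: str   # variable name for accesses, message id for send/recv


def build_hb_edges(trace):
    """Build happens-before edges from two simplified rules."""
    edges = defaultdict(set)
    last_in_thread = {}
    sends = {}
    for ev in trace:
        # Rule 1: program order within one thread on one node.
        key = (ev.node, ev.thread)
        if key in last_in_thread:
            edges[last_in_thread[key]].add(ev.eid)
        last_in_thread[key] = ev.eid
        # Rule 2: a send happens before its matching receive.
        # Assumes the send appears earlier in the trace than the receive.
        if ev.kind == "send":
            sends[ev.target] = ev.eid
        elif ev.kind == "recv" and ev.target in sends:
            edges[sends[ev.target]].add(ev.eid)
    return edges


def reachable(edges, start):
    """Return every event id reachable from start via happens-before edges."""
    seen, stack = set(), [start]
    while stack:
        cur = stack.pop()
        for nxt in edges.get(cur, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def find_candidates(trace):
    """Report pairs of concurrent, conflicting accesses as (eid, eid) tuples."""
    edges = build_hb_edges(trace)
    reach = {ev.eid: reachable(edges, ev.eid) for ev in trace}
    accesses = [ev for ev in trace if ev.kind in ("read", "write")]
    candidates = []
    for a, b in combinations(accesses, 2):
        same_location = a.node == b.node and a.target == b.target
        conflicting = "write" in (a.kind, b.kind)
        ordered = b.eid in reach[a.eid] or a.eid in reach[b.eid]
        if same_location and conflicting and not ordered:
            candidates.append((a.eid, b.eid))
    return candidates


# Example: node A writes x, then messages node B, which messages A back.
# A's handler thread reads x (ordered after the write through the messages),
# while an unrelated thread t3 on A writes x with no ordering at all.
EXAMPLE_TRACE = [
    Event(1, "A", "t1", "write", "x"),
    Event(2, "A", "t1", "send", "m1"),
    Event(3, "B", "t1", "recv", "m1"),
    Event(4, "B", "t1", "send", "m2"),
    Event(5, "A", "t2", "recv", "m2"),
    Event(6, "A", "t2", "read", "x"),
    Event(7, "A", "t3", "write", "x"),
]

if __name__ == "__main__":
    print(find_candidates(EXAMPLE_TRACE))  # [(1, 7), (6, 7)]
```

In this example the pair (1, 6) is not reported because the two messages order the write before the read, whereas the write in thread t3 is unordered with respect to both other accesses. Such pairs are only candidates: as the abstract notes, a separate step is needed to prune false positives and to trigger the bug.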
