
BigDebug: Debugging Primitives for Interactive Big Data Processing in Spark



Abstract

Developers use cloud computing platforms to process large quantities of data in parallel when developing big data analytics. Debugging the massively parallel computations that run in today’s data centers is time-consuming and error-prone. To address this challenge, we design a set of interactive, real-time debugging primitives for big data processing in Apache Spark, the next generation data-intensive scalable cloud computing platform. This requires rethinking the notion of step-through debugging in a traditional debugger such as gdb, because pausing the entire computation across distributed worker nodes causes significant delay, and naively inspecting millions of records using a watchpoint is too time-consuming for an end user. First, BIGDEBUG’s simulated breakpoints and on-demand watchpoints allow users to selectively examine distributed, intermediate data on the cloud with little overhead. Second, a user can also pinpoint a crash-inducing record and selectively resume relevant sub-computations after a quick fix. Third, a user can determine the root causes of errors (or delays) at the level of individual records through a fine-grained data provenance capability. Our evaluation shows that BIGDEBUG scales to terabytes and its record-level tracing incurs less than 25% overhead on average. It determines crash culprits orders of magnitude more accurately and provides up to 100% time saving compared to the baseline replay debugger. The results show that BIGDEBUG supports debugging at interactive speeds with minimal performance impact.
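To make the three ideas in the abstract more concrete, the following is a minimal single-machine sketch in plain Python. It is not BIGDEBUG's API and does not use Spark; the names `DebugTrace` and `traced_map` are hypothetical and exist only for illustration. It assumes each record carries an input id as a simple form of record-level provenance, captures only the records that match a user-supplied guard (in the spirit of an on-demand watchpoint), and isolates a crash-inducing record instead of aborting the whole computation.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Iterable, List, Tuple


@dataclass
class DebugTrace:
    """Holds what a user would inspect. Hypothetical helper, not a BIGDEBUG type."""
    watched: List[Tuple[int, Any]] = field(default_factory=list)        # (input id, value)
    culprits: List[Tuple[int, Any, str]] = field(default_factory=list)  # (input id, record, error)


def traced_map(
    records: Iterable[Tuple[int, Any]],
    fn: Callable[[Any], Any],
    watch: Callable[[Any], bool],
    trace: DebugTrace,
) -> List[Tuple[int, Any]]:
    """Apply fn to each (input_id, value) pair, keeping input_id as provenance."""
    out = []
    for input_id, value in records:
        try:
            result = fn(value)
        except Exception as exc:
            # Record the crash-inducing input and keep processing the rest.
            trace.culprits.append((input_id, value, repr(exc)))
            continue
        if watch(result):
            # Capture only records matching the guard, not every record.
            trace.watched.append((input_id, result))
        out.append((input_id, result))
    return out


raw = ["10", "25", "oops", "-3", "40"]   # made-up sample input
tagged = list(enumerate(raw))            # attach an input id to every record
trace = DebugTrace()
parsed = traced_map(tagged, int, lambda v: v < 0, trace)

print(parsed)
print(trace.watched)
print(trace.culprits)
```

In this sketch the malformed record is reported together with its input id rather than failing the whole job, and the guard surfaces only the negative value. A distributed system would additionally have to propagate such ids across stages and worker nodes, which this example does not attempt.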
