HPBDC 2018 Keynote

Abstract

We analyse the components needed in programming environments for Big Data analysis systems that combine scalable HPC performance with the functionality of ABDS, the Apache Big Data Software Stack. One highlight is Harp-DAAL, a machine learning library that exploits the Intel node library DAAL and HPC communication collectives within the Hadoop ecosystem. Another highlight is Twister2, a set of middleware components that support the batch or streaming data capabilities familiar from Apache Hadoop, Spark, Heron and Flink, but with high performance. Twister2 covers bulk synchronous and dataflow communication; task management as in Mesos, Yarn and Kubernetes; dataflow graph execution models; launching of the Harp-DAAL library; streaming and repository data access interfaces; in-memory databases; and fault tolerance at dataflow nodes. Similar capabilities are available in current Apache systems, but only as integrated packages that do not allow the customization needed for different application scenarios. We also discuss the synergy between cloud management (DevOps) and cloud execution systems.
