HPBDC 2018 Keynote

Abstract

We analyse the components needed in programming environments for Big Data analysis systems that combine scalable HPC performance with the functionality of ABDS, the Apache Big Data Software Stack. One highlight is Harp-DAAL, a machine learning library that exploits the Intel node library DAAL and HPC communication collectives within the Hadoop ecosystem. Another highlight is Twister2, a set of middleware components that support the batch or streaming data capabilities familiar from Apache Hadoop, Spark, Heron and Flink, but with high performance. Twister2 covers bulk synchronous and dataflow communication; task management as in Mesos, Yarn and Kubernetes; dataflow graph execution models; launching of the Harp-DAAL library; streaming and repository data access interfaces; in-memory databases; and fault tolerance at dataflow nodes. Similar capabilities are available in current Apache systems, but only as integrated packages that do not allow the customization needed for different application scenarios. We also discuss the synergy between cloud management (DevOps) and cloud execution systems.