Conference: International IT performance and capacity conference

Performance Analysis of Big Data Analytics on Lustre and HDFS File Systems

Abstract

Big data technology is widely used to analyze large volumes of data. Wide acceptance of the open source Hadoop platform encourages its use for real-time analytics as well, which demands high performance from the system. Moreover, most High Performance Computing (HPC) applications may also use data analytics to improve their execution time by reducing the number of simulation cycles. HDFS is the traditional file system used with Hadoop, while Lustre is one of the file systems commonly used in HPC systems. Can the same HPC setup be used for data analytics as well? This paper addresses the question by comparing the performance of Hive SQL and MapReduce jobs executed on the Lustre and HDFS file systems. The systems are evaluated with financial, telecom, and insurance applications on Intel HPDA clusters. The results show that application performance on Lustre is at least twice that on HDFS. The paper also discusses the impact of horizontal and vertical cluster scaling on the performance of applications deployed on the Lustre and HDFS file systems.