Source: International Journal of Parallel Programming

Improving Performance on Data-Intensive Applications Using a Load Balancing Methodology Based on Divisible Load Theory


Abstract

Data-intensive applications explore, query, analyze, and, more generally, process very large data sets. These applications usually lend themselves to parallel implementation, but in many cases the parallel implementations show severe performance problems, mainly due to load imbalance, inefficient use of available resources, and improper data partitioning policies. The problem becomes more complex when the conditions causing these problems change at run time. This paper proposes a methodology for dynamically improving the performance of certain data-intensive applications by adapting the size and number of data partitions, and the number of processing nodes, to the current application conditions in homogeneous clusters. To this end, the processing of each exploration is monitored, and the gathered data is used to dynamically tune the performance of the application. The tuning parameters included in the methodology are: (i) the partition factor of the data set, (ii) the distribution of the data chunks, and (iii) the number of processing nodes to be used. The methodology assumes that a single execution includes multiple related explorations on the same partitioned data set, and that data chunks are ordered according to their processing times during the application execution so that the most time-consuming partitions are assigned first. The methodology has been validated using the well-known bioinformatics tool BLAST and through extensive simulation-based experimentation. Reported results are encouraging: total execution time of the application is reduced by up to 40% in some cases.
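
The chunk-ordering idea in the abstract (assign the most time-consuming partitions first) can be illustrated with a small sketch. This is not the paper's implementation: it is a generic longest-processing-time-first greedy assignment on a homogeneous set of nodes. The function name `assign_chunks`, the parameter names, and the sample timings are hypothetical, chosen only for illustration.

```python
import heapq


def assign_chunks(chunk_times, num_nodes):
    """Greedy sketch: hand out chunks in descending order of measured
    processing time, always to the node with the least accumulated work.

    chunk_times: per-chunk processing times measured in a previous
                 exploration (hypothetical input for this sketch).
    num_nodes:   number of identical (homogeneous) processing nodes.
    Returns (assignment, makespan), where assignment maps a node id to
    the list of chunk indices it processes.
    """
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")

    # Min-heap of (accumulated_time, node_id): the least loaded node pops first.
    heap = [(0.0, node) for node in range(num_nodes)]
    heapq.heapify(heap)
    assignment = {node: [] for node in range(num_nodes)}

    # Order chunk indices so the most time-consuming chunks go first.
    order = sorted(range(len(chunk_times)),
                   key=lambda i: chunk_times[i],
                   reverse=True)

    for chunk in order:
        load, node = heapq.heappop(heap)
        assignment[node].append(chunk)
        heapq.heappush(heap, (load + chunk_times[chunk], node))

    # The estimated execution time is the load of the busiest node.
    makespan = max(load for load, _ in heap)
    return assignment, makespan


if __name__ == "__main__":
    # Made-up timings (seconds) for five chunks on two nodes.
    assignment, makespan = assign_chunks([8, 7, 6, 5, 4], 2)
    print(assignment, makespan)
```

In a setting like the one the abstract describes, the measured times from one exploration could feed the ordering for the next exploration on the same partitioned data set. Comparing the resulting makespan for different values of `num_nodes` is one simple way to reason about how many nodes are worth using.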
