Journal: Parallel Computing

A Generate-Test-Aggregate parallel programming library for systematic parallel programming

Abstract

The Generate-Test-Aggregate (GTA) algorithm follows a simple programming pattern for combinatorial problems: first, generate all candidates; second, test them and filter out the invalid ones; finally, aggregate the valid ones into the final result. These three steps are specified by three building blocks, namely a generator, a tester, and an aggregator. Although the design is simple, implementing a GTA algorithm naively, that is, by brute force through the three steps, leads to an exponential-cost computation and is therefore impractical for large data. The theory of GTA shows that if the definitions of the generator, tester, and aggregator satisfy certain conditions, an efficient (usually near-linear cost) MapReduce program can be derived automatically from the GTA algorithm. The principle is attractive, but making it practically useful remains an important and challenging problem because of the complexity of the GTA program transformations. In this paper, we report on our study and implementation of a practical GTA library, written in the functional language Scala, which provides a systematic parallel programming approach to big-data analysis with MapReduce. The library offers a simple functional-style programming interface and hides all internal transformations. With it, users can write parallel programs in a sequential manner in terms of the GTA algorithm, and the efficiency of the generated MapReduce programs is guaranteed systematically. Parallel programming for many problems therefore need no longer be a tough job. We demonstrate the usefulness of our GTA library on several interesting problems involving large data and show that many applications can be solved easily and efficiently with it.
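
To make the pattern concrete, here is a minimal sketch of the three building blocks for a 0-1 knapsack problem, written in Python rather than the library's Scala. It is not the library's interface: every name (`generate`, `make_tester`, `aggregate`, `gta_brute_force`, `gta_fused`) and the `(weight, value)` item representation are assumptions made for illustration. The brute-force version enumerates all 2^n candidates. The fused version only hints at the kind of program a GTA transformation aims for, keeping one best value per total-weight class. It runs sequentially and is not a MapReduce program.

```python
from itertools import combinations

# Hypothetical item representation: (weight, value), with non-negative
# integer weights. None of these names come from the GTA library itself.


def generate(items):
    """Generator: yield every sub-selection of items (2**n candidates)."""
    for size in range(len(items) + 1):
        yield from combinations(items, size)


def make_tester(capacity):
    """Tester: accept a candidate only if its total weight fits the capacity."""
    def test(candidate):
        return sum(weight for weight, _ in candidate) <= capacity
    return test


def aggregate(candidates):
    """Aggregator: the largest total value among the surviving candidates."""
    return max(sum(value for _, value in c) for c in candidates)


def gta_brute_force(items, capacity):
    """Naive GTA: generate everything, filter, then aggregate (exponential)."""
    test = make_tester(capacity)
    return aggregate(c for c in generate(items) if test(c))


def gta_fused(items, capacity):
    """Fused sketch: track only the best value per total weight <= capacity.

    Cost is O(len(items) * capacity) instead of O(2 ** len(items)).
    Assumes capacity >= 0, so the empty selection is always valid.
    """
    best = {0: 0}  # total weight -> best total value seen so far
    for weight, value in items:
        updated = dict(best)
        for total_weight, total_value in best.items():
            new_weight = total_weight + weight
            if new_weight <= capacity:
                candidate_value = total_value + value
                updated[new_weight] = max(
                    updated.get(new_weight, candidate_value), candidate_value
                )
        best = updated
    return max(best.values())


if __name__ == "__main__":
    items = [(2, 3), (3, 4), (4, 5), (5, 6)]
    # Both versions should agree on the best achievable value.
    print(gta_brute_force(items, 5), gta_fused(items, 5))
```

The two functions compute the same result, but only the brute-force one follows the generate, test, aggregate steps literally. The abstract's claim is that such a fused form can be derived automatically when the three building blocks satisfy certain conditions, so the user writes only the naive specification.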
