A Generate-Test-Aggregate parallel programming library for systematic parallel programming

Yu Liu; Kento Emoto; Zhenjiang Hu

Abstract

The Generate-Test-Aggregate (GTA for short) algorithm follows a simple and straightforward programming pattern for combinatorial problems: first, generate all candidates; second, test them and filter out the invalid ones; finally, aggregate the valid ones to produce the final result. These three processing steps are specified by three building blocks, namely a generator, a tester, and an aggregator. Despite the simplicity of the algorithm design, implementing a GTA algorithm naively by following the three steps, i.e., by brute force, results in an exponential-cost computation and is therefore impractical for processing large data. The theory of GTA shows that if the definitions of the generator, tester, and aggregator satisfy certain conditions, an efficient (usually near-linear cost) MapReduce program can be derived automatically from the GTA algorithm. The principle of GTA is attractive, but making it practically useful remains an important and challenging problem because of the complexity of the GTA program transformations. In this paper, we report on our study and implementation of a practical GTA library (written in the functional language Scala) that provides a systematic parallel programming approach for big-data analysis with MapReduce. The library provides a simple functional-style programming interface and hides all the internal transformations. With this library, users can write parallel programs in a sequential manner in terms of the GTA algorithm, and the efficiency of the generated MapReduce programs is guaranteed systematically. Parallel programming for many problems therefore need no longer be a tough job. We demonstrate the usefulness of our GTA library on several interesting problems involving large data and show that many applications can be solved easily and efficiently with it.
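To make the generate, test, aggregate pattern concrete, here is a minimal Python sketch on the 0/1 knapsack problem. It is an illustration only and not the paper's Scala library or its API: the choice of knapsack as the example, and the names `brute_force_knapsack`, `item_table`, `combine`, and `fused_knapsack`, are all assumptions made here. The first function follows the three steps literally and enumerates all 2^n sublists. The second is a hand-written stand-in for the kind of program the abstract says can be derived automatically: each item is mapped to a small table indexed by capped weight, and tables are merged with an associative operation, so the work could in principle be split across MapReduce workers. Weights are assumed to be non-negative integers and values non-negative.

```python
from functools import reduce
from itertools import combinations


def brute_force_knapsack(items, capacity):
    """Naive GTA: generate all sublists, test the weight limit, aggregate by max."""
    # Generator: every sublist of the items (2^n candidates).
    candidates = (
        subset
        for size in range(len(items) + 1)
        for subset in combinations(items, size)
    )
    # Tester: keep only candidates whose total weight fits the capacity.
    valid = (c for c in candidates if sum(w for w, _ in c) <= capacity)
    # Aggregator: the best total value among the valid candidates.
    return max(sum(v for _, v in c) for c in valid)


def item_table(item, capacity):
    """Map step (hypothetical): best value per capped weight for a single item."""
    weight, value = item
    table = {0: 0}  # the item is left out
    key = min(weight, capacity + 1)  # capacity + 1 stands for "overweight"
    table[key] = max(table.get(key, value), value)  # the item is taken
    return table


def combine(left, right, capacity):
    """Reduce step (hypothetical): associative merge of two weight-indexed tables."""
    merged = {}
    for w1, v1 in left.items():
        for w2, v2 in right.items():
            key = min(w1 + w2, capacity + 1)
            total = v1 + v2
            merged[key] = max(merged.get(key, total), total)
    return merged


def fused_knapsack(items, capacity):
    """Same answer as the brute-force version, at cost linear in the item count."""
    tables = (item_table(item, capacity) for item in items)
    merged = reduce(lambda a, b: combine(a, b, capacity), tables, {0: 0})
    # Discard the "overweight" bucket, then take the best remaining value.
    return max(v for w, v in merged.items() if w <= capacity)


if __name__ == "__main__":
    sample = [(1, 10), (2, 20), (3, 30), (4, 45)]  # (weight, value) pairs
    print(brute_force_knapsack(sample, 5))  # 55
    print(fused_knapsack(sample, 5))        # 55
```

In this sketch each table has at most capacity + 2 entries, so a merge costs at most about (capacity + 2)^2 steps regardless of how many items it summarizes. That is what keeps the second version near-linear in the number of items, while the first grows exponentially.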

Bibliographic record

Source: Parallel Computing, 2014, Issue 2, pp. 116-135 (20 pages)
Authors: Yu Liu; Kento Emoto; Zhenjiang Hu
Affiliations:

The Graduate University for Advanced Studies, Tokyo, Japan; National Institute of Informatics, Tokyo, Japan

Kyushu Institute of Technology, Iizuka, Japan

The Graduate University for Advanced Studies, Tokyo, Japan; National Institute of Informatics, Tokyo, Japan

Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
Keywords:
High-level parallel programming; Generate-Test-Aggregate algorithm; Program transformation; Program calculation; MapReduce; Functional programming
