A Speculative Parallel DFA Membership Test for Multicore, SIMD and Cloud Computing Environments

Yousun Ko; Minyoung Jung; Yo-Sub Han; Bernd Burgstaller

首页> 外文期刊>International journal of parallel programming >A Speculative Parallel DFA Membership Test for Multicore, SIMD and Cloud Computing Environments

【24h】

A Speculative Parallel DFA Membership Test for Multicore, SIMD and Cloud Computing Environments

机译：针对多核，SIMD和云计算环境的推测性并行DFA成员资格测试

获取原文

获取原文并翻译 | 示例

掌桥外文数据库（机构版） >>

开具论文收录证明 >>

文献代查 >>

页面导航

摘要
著录项
相似文献
相关主题

摘要

We present techniques to parallelize membership tests for Deterministic Finite Automata (DFAs). Our method searches arbitrary regular expressions by matching multiple bytes in parallel using speculation. We partition the input string into chunks, match chunks in parallel, and combine the matching results. Our parallel matching algorithm exploits structural DFA properties to minimize the speculative overhead. Unlike previous approaches, our speculation is failure-free, i.e., (1) sequential semantics are maintained, and (2) speed-downs are avoided altogether. On architectures with a SIMD gather-operation for indexed memory loads, our matching operation is fully vectorized. The proposed load-balancing scheme uses an off-line profiling step to determine the matching capacity of each participating processor. Based on matching capacities, DFA matches are load-balanced on inhomogeneous parallel architectures such as cloud computing environments. We evaluated our speculative DFA membership test for a representative set of benchmarks from the Perl-compatible Regular Expression (PCRE) library and the PROSITE protein database. Evaluation was conducted on a 4 CPU (40 cores) shared-memory node of the Intel Academic Program Manycore Testing Lab (Intel MTL), on the Intel AVX2 SDE simulator for 8-way fully vectorized SIMD execution, and on a 20-node (288 cores) cluster on the Amazon EC2 computing cloud. Obtained speedups are on the order of O (1 + |P|-1/|Q|·γ), where | P | denotes the number of processors or SIMD units, | Q | denotes the number of DFA states, and 0 ＜ γ ≤ 1 represents a statically computed DFA property. For all observed cases, we found that 0.02 ＜ γ ＜ 0.47. Actual speedups range from 2.3 × to 38.8× for up to 512 DFA states for PCRE, and between 1.3× and 19.9 × for up to 1,288 DFA states for PROSITE on a 40-core MTL node. Speedups on the EC2 computing cloud range from 5.0× to 65.8× for PCRE, and from 5.0× to 138.5× for PROSITE. Speedups of our C-based DFA matcher over the Perl-based ScanProsite scan tool range from 559.3 × to 15079.7 × on a 40-core MTL node. We show the scalability of our approach for input-sizes of up to 10 GB.

机译：我们提出了并行化确定性有限自动机（DFA）成员资格测试的技术。我们的方法通过使用推测并行匹配多个字节来搜索任意正则表达式。我们将输入字符串划分为多个块，并行匹配块，然后组合匹配结果。我们的并行匹配算法利用DFA的结构属性来最大程度地减少投机开销。与以前的方法不同，我们的猜测是无故障的，即，（1）保留了顺序语义，并且（2）完全避免了速度下降。在具有用于索引存储器负载的SIMD收集操作的体系结构上，我们的匹配操作已完全矢量化。所提出的负载平衡方案使用离线分析步骤来确定每个参与处理器的匹配能力。基于匹配能力，DFA匹配将在不均匀的并行体系结构（例如云计算环境）上进行负载平衡。我们评估了Perf兼容正则表达式（PCRE）库和PROSITE蛋白质数据库中一组代表性基准的推测性DFA成员资格测试。评估是在Intel学术计划Manycore测试实验室（Intel MTL）的4 CPU（40核）共享内存节点，用于8路完全矢量化SIMD执行的Intel AVX2 SDE模拟器以及20节点（ 288个核心）群集在Amazon EC2计算云上。获得的加速比约为O（1 + | P | -1 / | Q |·γ），其中| P |表示处理器或SIMD单元的数量，|问|表示DFA状态数，0 ＜γ≤1表示静态计算的DFA属性。对于所有观察到的情况，我们发现0.02 ＜γ＜0.47。对于40核MTL节点，对于PCRS最多512个DFA状态，实际加速范围为2.3×到38.8×，对于PROSITE最多1,288 DFA状态，实际加速范围为1.3×和19.9×。对于PCRE，EC2计算云的加速范围为5.0倍至65.8倍，对于PROSITE，加速范围为5.0倍至138.5倍。在40核MTL节点上，基于Perl的ScanProsite扫描工具的基于C的DFA匹配器的加速范围为559.3×到15079.7×。对于最大10 GB的输入大小，我们展示了我们方法的可扩展性。

著录项

来源
《International journal of parallel programming》 |2014年第3期|456-489|共34页
作者
Yousun Ko; Minyoung Jung; Yo-Sub Han; Bernd Burgstaller;
展开▼
作者单位

Department of Computer Science, Yonsei University, Seoul, Korea;

Department of Computer Science, Yonsei University, Seoul, Korea;

Department of Computer Science, Yonsei University, Seoul, Korea;

Department of Computer Science, Yonsei University, Seoul, Korea;

展开▼
收录信息美国《科学引文索引》(SCI);美国《工程索引》(EI);
原文格式 PDF
正文语种 eng
中图分类
关键词
DFA membership test; Parallel pattern matching; Parallel regular expression matching; Speculative parallelization; Multicores;

机译：DFA成员资格测试;并行模式匹配;并行正则表达式匹配;推测并行化;多核;

相似文献

外文文献
中文文献
专利

1. Flexbar 3.0-SIMD and multicore parallelization [J] . Bioinformatics . 2017,第18期

机译：FlexBar 3.0-SIMD和多核并行化
2. Speculative Parallelization of Sequential Loops on Multicores [J] . Chen Tian, Min Feng, Vijay Nagarajan, International journal of parallel programming . 2009,第5期

机译：多核序列环的推测并行化
3. Speculative Parallelization of Sequential Loops on Multicores [J] . Chen Tian, Min Feng, Vijay Nagarajan, International Journal of Parallel Programming . 2009,第5期

机译：多核序列环的推测并行化
4. Large-Scale Software Testing Environment Using Cloud Computing Technology for Dependable Parallel and Distributed Systems [C] . Hanawa Toshihiro, Banzai Takayuki, Koizumi Hitoshi, Third International Conference on Software Testing, Verification, and Validation—Workshops . 2010

机译：使用云计算技术的大型软件测试环境，用于可靠的并行和分布式系统
5. Scalable parallel computing on clouds: Efficient and scalable architectures to perform pleasingly parallel, MapReduce and iterative data intensive computations on cloud environments. [D] . Gunarathne, Thilina. 2014

机译：云上的可伸缩并行计算：高效且可伸缩的架构，可在云环境上执行令人满意的并行，MapReduce和迭代式数据密集型计算。
6. Parallel Digital Watermarking Process on Ultrasound Medical Images in Multicores Environment [O] . Hui Liang Khor, Siau-Chuin Liew, Jasni Mohd. Zain 2016

机译：多核环境中超声医学图像的并行数字水印处理
7. A speculative parallel DFA membership test for multicore, SIMD and cloud computing environments [O] . Yousun Ko, Minyoung Jung, Yo-sub Han, 2016

机译：针对多核，sImD和云计算环境的推测性并行DFa成员资格测试
8. Parallel Sparse Linear System and Eigenvalue Problem Solvers: From Multicore to Petascale Computing. [R] . Sameh, A. H. 2015

机译：并行稀疏线性系统和特征值问题求解器：从多核到千万亿次计算。

A Speculative Parallel DFA Membership Test for Multicore, SIMD and Cloud Computing Environments

摘要

著录项

相似文献

相关主题

期刊订阅