Feluca: A Two-Stage Graph Coloring Algorithm With Color-Centric Paradigm on GPU

Zheng Zhigao; Shi Xuanhua; He Ligang; Jin Hai; Wei Shuo; Dai Hulin; Peng Xuan

Abstract

Performing graph coloring on GPUs poses three main challenges. First, recursion-based algorithms suffer from a long-tail problem, because conflicts (i.e., different threads assigning the same color to adjacent nodes) become more likely as the number of iterations increases. Second, the sequential spread algorithm is hard to parallelize because the color allocation in one iteration depends on the adjoining iteration. Third, atomic operations are widely used on GPUs to maintain the color list, which can greatly reduce the efficiency of GPU threads. In this article, we propose Feluca, a two-stage high-performance graph coloring algorithm that aims to address these challenges. Feluca combines the recursion-based method with the sequential spread-based method. In the first stage, Feluca uses a recursive routine to color the majority of vertices in the graph. It then switches to the sequential spread method to color the remaining vertices, avoiding the conflicts of the recursive algorithm. In addition, the following techniques are proposed to further improve graph coloring performance: i) a new method to eliminate the cycles in the graph; ii) a top-down scheme that avoids the atomic operations originally required for color selection; and iii) a novel color-centric coloring paradigm that improves the degree of parallelism of the sequential spread part. Together with further GPU-specific optimizations such as coalesced memory access, these techniques make up an efficient parallel graph coloring solution in Feluca. We have conducted extensive experiments on NVIDIA GPU hardware. The results show that Feluca achieves a 1.19x to 8.39x speedup over state-of-the-art algorithms.
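The two-stage idea in the abstract can be illustrated with a small CPU-side sketch. This is not Feluca's implementation: the abstract gives no kernel details, so everything below is an assumption made for illustration, including the function names (`speculative_round`, `sequential_finish`, `two_stage_coloring`), the lowest-id tie-break used to resolve conflicts, and the fixed `max_rounds` switch point. The first stage simulates rounds in which every uncolored vertex picks a color from the same snapshot, so adjacent vertices can collide; the second stage colors whatever is left one vertex at a time, which cannot produce conflicts.

```python
def smallest_free_color(adj, color, v):
    """Smallest non-negative color not used by v's already-colored neighbors."""
    used = {color[u] for u in adj[v] if u in color}
    c = 0
    while c in used:
        c += 1
    return c


def speculative_round(adj, color, active):
    """Simulate one parallel round (assumed scheme, not Feluca's kernel).

    Every active vertex proposes a color from the same snapshot, so two
    adjacent active vertices may propose the same color (a conflict).
    On a conflict the higher-id vertex gives up its color (assumed rule).
    Returns the set of vertices that are still uncolored.
    """
    snapshot = dict(color)
    proposal = {v: smallest_free_color(adj, snapshot, v) for v in active}
    losers = set()
    for v in active:
        for u in adj[v]:
            if u in proposal and proposal[u] == proposal[v] and u < v:
                losers.add(v)
                break
    for v in active - losers:
        color[v] = proposal[v]
    return losers


def sequential_finish(adj, color, remaining):
    """Color the remaining vertices one by one, so no conflict can occur."""
    for v in sorted(remaining):
        color[v] = smallest_free_color(adj, color, v)


def two_stage_coloring(adj, max_rounds=2):
    """Stage 1: a few speculative rounds. Stage 2: sequential cleanup."""
    color = {}
    active = set(adj)
    for _ in range(max_rounds):
        if not active:
            break
        active = speculative_round(adj, color, active)
    sequential_finish(adj, color, active)
    return color


def is_proper(adj, color):
    """True if every vertex is colored and no edge joins equal colors."""
    return all(v in color and color[u] != color[v] for v in adj for u in adj[v])


# Hypothetical input: a 5-cycle stored as an undirected adjacency dict.
adj = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 0}}
color = two_stage_coloring(adj, max_rounds=2)
print(color, is_proper(adj, color))
```

In this sketch, a vertex that loses a conflict stays uncolored until a later round, which is one way to picture the long tail the abstract describes. Handing the leftovers to a conflict-free sequential pass bounds the number of speculative rounds.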

Bibliographic details

Source
IEEE Transactions on Parallel and Distributed Systems, 2021, Issue 1, pp. 160-173 (14 pages)
Authors
Zheng Zhigao; Shi Xuanhua; He Ligang; Jin Hai; Wei Shuo; Dai Hulin; Peng Xuan
Author affiliations

Huazhong Univ Sci & Technol Natl Engn Res Ctr Big Data Technol & Syst Serv Comp Technol & Syst Lab Wuhan 430074 Peoples R China;

Huazhong Univ Sci & Technol Natl Engn Res Ctr Big Data Technol & Syst Serv Comp Technol & Syst Lab Wuhan 430074 Peoples R China;

Univ Warwick Dept Comp Sci Coventry CV4 7AL W Midlands England;

Huazhong Univ Sci & Technol Natl Engn Res Ctr Big Data Technol & Syst Serv Comp Technol & Syst Lab Wuhan 430074 Peoples R China;

Huazhong Univ Sci & Technol Natl Engn Res Ctr Big Data Technol & Syst Serv Comp Technol & Syst Lab Wuhan 430074 Peoples R China;

Huazhong Univ Sci & Technol Natl Engn Res Ctr Big Data Technol & Syst Serv Comp Technol & Syst Lab Wuhan 430074 Peoples R China;

Huazhong Univ Sci & Technol Natl Engn Res Ctr Big Data Technol & Syst Serv Comp Technol & Syst Lab Wuhan 430074 Peoples R China;

Original format: PDF
Language: English
Keywords
Color; Graphics processing units; Image color analysis; Task analysis; Parallel processing; Computational modeling; Synchronization; Graph coloring; GPGPU; parallelism; color-centric paradigm; pipeline;
