Journal: IEEE Transactions on Parallel and Distributed Systems

Feluca: A Two-Stage Graph Coloring Algorithm With Color-Centric Paradigm on GPU

Abstract

Performing graph coloring on GPUs poses several challenges. First, recursion-based algorithms suffer from a long-tail problem because conflicts (i.e., different threads assigning the same color to adjacent nodes) become more likely as the number of iterations increases. Second, the sequential spread algorithm is hard to parallelize because the color allocation in one iteration depends on the adjoining iteration. Third, atomic operations are widely used on GPUs to maintain the color list, which can greatly reduce the efficiency of GPU threads. In this article, we propose a two-stage high-performance graph coloring algorithm, called Feluca, that aims to address these challenges. Feluca combines the recursion-based method with the sequential spread-based method. In the first stage, Feluca uses a recursive routine to color the majority of vertices in the graph. It then switches to the sequential spread method to color the remaining vertices, avoiding the conflicts of the recursive algorithm. In addition, the following techniques are proposed to further improve graph coloring performance: i) a new method to eliminate the cycles in the graph; ii) a top-down scheme that avoids the atomic operation originally required for color selection; and iii) a novel color-centric coloring paradigm that improves the degree of parallelism of the sequential spread part. These newly developed techniques, together with further GPU-specific optimizations such as coalesced memory access, make up an efficient parallel graph coloring solution in Feluca. We have conducted extensive experiments on NVIDIA GPUs. The results show that Feluca achieves a 1.19x to 8.39x speedup over state-of-the-art algorithms.
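The two-stage idea in the abstract can be illustrated with a small CPU-side sketch. This is not Feluca's implementation: the abstract gives no details of the recursive routine, cycle elimination, top-down scheme, or color-centric paradigm, so none of them are modeled here. The sketch assumes a generic speculative scheme in which every pending vertex picks a color from a stale snapshot (mimicking GPU threads running at once), conflicting vertices are reset, and whatever remains after a fixed number of rounds is colored by a sequential greedy pass. The names `two_stage_coloring`, `smallest_free_color`, and `max_rounds` are hypothetical.

```python
from typing import Dict, List


def smallest_free_color(v: int, adj: Dict[int, List[int]], color: Dict[int, int]) -> int:
    """Return the smallest non-negative color not used by v's colored neighbors."""
    used = {color[u] for u in adj[v] if color[u] >= 0}
    c = 0
    while c in used:
        c += 1
    return c


def two_stage_coloring(adj: Dict[int, List[int]], max_rounds: int = 2) -> Dict[int, int]:
    """Illustrative two-stage coloring (assumed scheme, not the paper's algorithm)."""
    color = {v: -1 for v in adj}  # -1 means "not colored yet"
    pending = sorted(adj)

    # Stage 1: speculative rounds. Every pending vertex reads the same stale
    # snapshot, so adjacent pending vertices may choose the same color.
    for _ in range(max_rounds):
        if not pending:
            break
        snapshot = dict(color)
        for v in pending:
            color[v] = smallest_free_color(v, adj, snapshot)
        # Conflict detection: a vertex loses if a lower-id neighbor has its color.
        losers = [
            v for v in pending
            if any(color[u] == color[v] and u < v for u in adj[v])
        ]
        for v in losers:
            color[v] = -1
        pending = losers

    # Stage 2: sequential greedy pass over the leftover vertices. Each vertex
    # sees up-to-date neighbor colors, so no new conflicts can appear.
    for v in pending:
        color[v] = smallest_free_color(v, adj, color)
    return color
```

Calling `two_stage_coloring(adj)` on an adjacency dictionary such as a 5-cycle returns a proper coloring. Capping the speculative rounds is the point of the sketch: it stops the conflict-and-retry loop before it degenerates into many rounds that each fix only a few vertices, which is the long-tail behavior the abstract describes.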