首页> 外文会议>2017 IEEE 24th International Conference on High Performance Computing >Parallel Asynchronous Distributed-Memory Maximal Independent Set Algorithm with Work Ordering
【24h】

Parallel Asynchronous Distributed-Memory Maximal Independent Set Algorithm with Work Ordering

机译:具有工作排序的并行异步分布式内存最大独立集算法

获取原文
获取原文并翻译 | 示例

摘要

The maximal independent set (MIS) graph problem arises in many applications such as computer vision, information theory, molecular biology, and process scheduling. The growing scale of graph data suggests the use of distributed memory hardware as a cost-effective approach to providing necessary compute and memory resources. Existing distributed memory parallel MIS algorithms rely on synchronous communication and use techniques such as subgraph computations. In this paper, we present an asynchronous distributed-memory parallel graph algorithm that relies on a virtual directed acyclic graph (DAG) that is created during the algorithm execution. We introduce two additional algorithms that save computations by ordering generated work. The first algorithm applies ordering globally to reduce computations, and the second algorithm applies ordering locally at the level of threads to minimize the synchronization overhead. We use two different implementations of Luby's algorithm variants as baseline to compare the performance of the presented algorithms: (1) vertex-centric Luby A and Luby B implementations, and (2) the CombBLAS linear-algebra Luby A implementation. Results show that proposed algorithms outperform both implementations of Luby algorithms, especially in distributed execution. Furthermore, we show that for low- diameter graphs the algorithm that applies global ordering scales better than other algorithms and for high diameter graphs the original asynchronous algorithm and thread-level ordering algorithm show better performance.
机译:最大独立集(MIS)图问题出现在许多应用中,例如计算机视觉,信息论,分子生物学和过程调度。图形数据的规模不断扩大,表明使用分布式内存硬件作为提供必要的计算和内存资源的经济有效的方法。现有的分布式内存并行MIS算法依赖于同步通信并使用诸如子图计算之类的技术。在本文中,我们提出了一种异步分布式内存并行图算法,该算法依赖于在算法执行期间创建的虚拟有向无环图(DAG)。我们介绍了另外两种通过排序生成的工作来节省计算量的算法。第一种算法全局应用排序以减少计算,第二种算法在线程级别局部应用排序以最小化同步开销。我们使用Luby算法变体的两种不同实现方式作为基准来比较所提出算法的性能:(1)以顶点为中心的Luby A和Luby B实现方式,以及(2)CombBLAS线性代数Luby A实现方式。结果表明,所提出的算法优于两种Luby算法的实现,尤其是在分布式执行中。此外,我们表明,对于低直径图,应用全局排序比例的算法比其他算法更好;对于高直径图,原始的异步算法和线程级排序算法表现出更好的性能。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号