In this paper we present a parallel implementation of the Bowyer-Watson (BW) algorithm using the task-parallel programming model. The BW algorithm constitutes an ideal mesh refinement strategy for implementing a large class of unstructured mesh generation techniques on both sequential and parallel computers, by preventing the need for global mesh refinement. Its implementation on distributed memory multicomputes using the traditional data-parallel model has been proven very inefficient due to excessive synchronization needed among processors. In this paper we demonstrate that with the task-parallel model we can tolerate synchronization costs inherent to data-parallel methods by exploring concurrency in the processor level. Our preliminary performance data indicate that the task-parallel approach: (ⅰ) is almost four times faster than the existing data-parallel methods, (ⅱ) scales linearly, and (ⅲ) introduces minimum overheads compared to the "best" sequential implementation of the BW algorithm.
展开▼