27-1 Implementing parallel loops using nested parallelism
Consider the following multithreaded algorithm for performing pairwise addition on \(n\)-element arrays \(A[1..n]\) and \(B[1..n]\), storing the sums in \(C[1..n]\):
C++ 1 2 3SUM-ARRAYS(A, B, C) parallel for i = 1 to A.length C[i] = A[i] + B[i]
a. Rewrite the parallel loop in \(\text{SUM-ARRAYS}\) using nested parallelism (spawn and sync) in the manner of \(\text{MAT-VEC-MAIN-LOOP}\). Analyze the parallelism of your implementation.
Consider the following alternative implementation of the parallel loop, which contains a value \(grain\text-size\) to be specified:
C++ 1 2 3 4 5 6 7SUM-ARRAYS'(A, B, C) n = A.length grain-size = ? // to be determined r = ceil(n / grain-size) for k = 0 to r - 1 spawn ADD-SUBARRAY(A, B, C, k * grain-size + 1, min((k + 1) * grain-size, n)) sync
C++ 1 2 3ADD-SUBARRAY(A, B, C, i, j) for k = i to j C[k] = A[k] + B[k]
b. Suppose that we set \(grain\text -size = 1\). What is the parallelism of this implementation?
c. Give a formula for the span of \(\text{SUM-ARRAYS}'\) in terms of \(n\) and \(grain\text-size\). Derive the best value for grain-size to maximize parallelism.
a. See the algorithm \(\text{SUM-ARRAYS}(A, B, C)\). The parallelism is \(O(n)\) since it's work is \(n\lg n\) and the span is \(\lg n\).
b. If grainsize is \(1\), this means that each call of \(\text{ADD-SUBARRAY}\) just sums a single pair of numbers. This means that since the for loop on line 4 will run \(n\) times, both the span and work will be \(O(n)\). So, the parallelism is just \(O(1)\).
C++ | |
---|---|
1 2 3 4 5 6 7 |
|
c. Let \(g\) be the grainsize. The runtime of the function that spawns all the other functions is \(\left\lceil \frac{n}{g} \right\rceil\). The runtime of any particular spawned task is \(g\). So, we want to minimize
To do this we pull out our freshman calculus hat and take a derivative, we have
To solve this, we set \(g = \sqrt n\). This minimizes the quantity and makes the span \(O(n / g + g) = O(\sqrt n)\). Resulting in a parallelism of \(O(\sqrt n)\).
本页面的全部内容在 小熊老师 - 莆田青少年编程俱乐部 0594codes.cn 协议之条款下提供,附加条款亦可能应用