@@ -6,3 +6,53 @@ communication, so can easily be overloaded.
 
 The application is installed as a python module with a shell
 script wrapper. The only requirement is MPI4PY.
+
+## Background
+
+Amdahl's law posits that some unit of work comprises a proportion *p* that
+benefits from parallel resources, and a proportion *s* that is constrained to
+execute in serial (so *s* + *p* = 1). The theoretical maximum speedup
+achievable for such a workload is
+
+```output
+        1
+S = -------
+    s + p/N
+```
+
+where *S* is the speedup relative to performing all of the work in serial and
+*N* is the number of parallel workers. A plot of *S* vs. *N* ought to look like
+this, for *p* = 0.8:
+
+```output
+  5┬─────────────────────────────────────·──────────────────┐
+   │                                   ·                    │
+   │                                 ·                      │
+   │                               ·                        │
+  4┤                             ·                          │
+   │                           ·                            │
+S  │                         ·                              *
+p  │                       ·                   *      *     │
+e  │                     ·               *                  │
+e 3┤                   ·           *                        │
+d  │                 ·      *                               │
+u  │               ·  *                                     │
+p  │             ·                                          │
+   │           ·*                                           │
+  2┤         ·                                              │
+   │     * ·                                                │
+   │     ·                                                  │
+   │   ·                                                    │
+   │ ·                                                      │
+  1*─────┬──────┬─────┬─────┬──────┬─────┬─────┬──────┬─────┤
+   1     2      3     4     5      6     7     8      9    10
+                            Workers
+```
+
+"Ideal scaling" (*p* = 1) would be the line *y* = *x* (or *S* = *N*),
+represented here by the dotted line.
+
+This graph shows there is a speed limit for every workload (*S* can never
+exceed 1/*s*, which is 5 for *p* = 0.8), and diminishing returns on throwing
+more parallel processors at a problem. It is worth running a "scaling study"
+to assess how far away that speed limit might be for the given task.
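
As a rough illustration of the formula above, the following sketch tabulates the theoretical speedup for 1 to 10 workers and reports the asymptotic limit. It assumes *s* = 1 - *p*; the function names `amdahl_speedup` and `speed_limit` are hypothetical helpers introduced here and are not part of the application.

```python
def amdahl_speedup(n_workers, parallel_proportion=0.8):
    """Theoretical speedup S = 1 / (s + p/N), assuming s = 1 - p."""
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    if not 0.0 <= parallel_proportion <= 1.0:
        raise ValueError("parallel_proportion must lie in [0, 1]")
    serial_proportion = 1.0 - parallel_proportion
    return 1.0 / (serial_proportion + parallel_proportion / n_workers)


def speed_limit(parallel_proportion=0.8):
    """Speedup approached as N grows without bound: 1 / s."""
    serial_proportion = 1.0 - parallel_proportion
    if serial_proportion == 0.0:
        return float("inf")  # ideal scaling has no upper bound
    return 1.0 / serial_proportion


if __name__ == "__main__":
    # Tabulate the curve that the plot above sketches for p = 0.8.
    for n in range(1, 11):
        print(f"N={n:2d}  S={amdahl_speedup(n):.2f}")
    print(f"limit as N grows: {speed_limit():.1f}")
```

Comparing measured speedups from a scaling study against this curve gives a quick sense of whether the assumed *p* is realistic for a given workload.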
 
 from mpi4py import MPI
 
+"""
+Gather timing data in order to plot speedup *S* vs. number of cores *N*,
+which should follow Amdahl's Law:
+
+        1
+S = -------
+    s + p/N
+
+where *s* is the serial proportion of the total work and *p* the
+parallelizable proportion.
+"""
 
 def do_work(work_time=30, parallel_proportion=0.8, comm=MPI.COMM_WORLD):
     # How many MPI ranks (cores) are we?