Project 1: Implementation and experimental evaluation for a suite of matrix multiplication algorithms
Implement the three algorithms for matrix multiplication
- standard
- blocked
- divide-and-conquer
For the standard algorithm and the divide-and-conquer algorithm:
Run a suite of experiments with various values of n, and plot the
running time of the matrix multiplication as a function of n.
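One possible shape for the divide-and-conquer algorithm that works for any n, not only powers of 2, is to recurse on rectangular sub-blocks and always halve the largest dimension. The sketch below is an illustration under assumptions, not a required design: the helper names base_mmult and rec_step, the cutoff REC_CUTOFF and its value 16, and the row-major layout are all hypothetical choices.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cutoff: blocks this small are multiplied with plain loops. */
#define REC_CUTOFF 16

/* Base case: C += A*B, where A is m x k, B is k x p, C is m x p.
   Every block lives inside a row-major n x n array, so the row stride is n. */
static void base_mmult(const double *a, const double *b, double *c,
                       size_t m, size_t k, size_t p, size_t n)
{
    for (size_t i = 0; i < m; i++)
        for (size_t l = 0; l < k; l++) {
            double ail = a[i * n + l];
            for (size_t j = 0; j < p; j++)
                c[i * n + j] += ail * b[l * n + j];
        }
}

/* Recursive step: halve the largest of the three dimensions.
   Odd sizes give halves that differ by one, so no padding is needed. */
static void rec_step(const double *a, const double *b, double *c,
                     size_t m, size_t k, size_t p, size_t n)
{
    if (m <= REC_CUTOFF && k <= REC_CUTOFF && p <= REC_CUTOFF) {
        base_mmult(a, b, c, m, k, p, n);
    } else if (m >= k && m >= p) {      /* split the rows of A and C */
        size_t h = m / 2;
        rec_step(a, b, c, h, k, p, n);
        rec_step(a + h * n, b, c + h * n, m - h, k, p, n);
    } else if (p >= k) {                /* split the columns of B and C */
        size_t h = p / 2;
        rec_step(a, b, c, m, k, h, n);
        rec_step(a, b + h, c + h, m, k, p - h, n);
    } else {                            /* split the inner dimension */
        size_t h = k / 2;
        rec_step(a, b, c, m, h, p, n);
        rec_step(a + h, b + h * n, c, m, k - h, p, n);
    }
}

/* c = a * b for row-major n x n matrices of doubles. */
void rec_mmult(const double *a, const double *b, double *c, size_t n)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t i = 0; i < n * n; i++)
        c[i] = 0.0;                     /* the recursion accumulates into c */
    rec_step(a, b, c, n, n, n, n);
}
```

The cutoff is itself a tunable parameter: a larger value means fewer recursive calls, so it may be worth trying a few values before running the timing experiments.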
For the blocked algorithm: First, run a suite of experiments for a
large value of n, with different values of the block size r. Plot the
running time for that value of n as a function of r. Pick a value of n
that's large enough that a row of the matrix does not fit in cache
(if possible). The optimal value of r is the one that minimizes
the running time. Experiment with a few different values of n to see
if the optimal block size changes. Second, once you have found the optimal
block size for that platform, run a suite of experiments keeping r
fixed, with various values of n. Plot the running time as a function of
n.
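For the block-size experiments it helps if r can be varied without recompiling. The following is a minimal sketch of a blocked multiplication that takes r as an argument and clips the last block, so it also works when r does not divide n. The name blocked_mmult_r, the extra parameter r, and the row-major layout are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

static size_t min_sz(size_t x, size_t y) { return x < y ? x : y; }

/* c = a * b for row-major n x n matrices, processed in blocks of at most r x r.
   Passing r as a parameter is an assumption made here for the experiments. */
void blocked_mmult_r(const double *a, const double *b, double *c,
                     size_t n, size_t r)
{
    assert(r > 0);
    for (size_t i = 0; i < n * n; i++)
        c[i] = 0.0;

    for (size_t ii = 0; ii < n; ii += r) {
        size_t i_end = min_sz(ii + r, n);          /* clip the last block */
        for (size_t kk = 0; kk < n; kk += r) {
            size_t k_end = min_sz(kk + r, n);
            for (size_t jj = 0; jj < n; jj += r) {
                size_t j_end = min_sz(jj + r, n);
                /* multiply one block of a by one block of b */
                for (size_t i = ii; i < i_end; i++)
                    for (size_t k = kk; k < k_end; k++) {
                        double aik = a[i * n + k];
                        for (size_t j = jj; j < j_end; j++)
                            c[i * n + j] += aik * b[k * n + j];
                    }
            }
        }
    }
}
```

With r read from the command line next to n, one executable can produce every point of the running-time-versus-r plot.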
Structure/tips:
- Write each matrix multiplication algorithm in its own separate file (like mmult.c, blocked_mmult.c, rec_mmult.c). Create one Makefile that compiles all three modules.
- All modules should read the value of n on the command line.
- All modules should work for any value of n, not only powers of 2 <----- added 9/17
- All modules should start by allocating and randomly initializing
the matrices. This should not be timed.
- The matrices should be stored as arrays of doubles.
- Run all experiments on the same platform: your laptop or a desktop in one of the labs (all machines in the lab are the same platform).
- The Makefile should compile with the -O3 flag.
- Include unit testing <--- added 9/17 (see below)
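The setup shared by all three modules (reading n, allocating and initializing the matrices, and timing only the multiplication) might be sketched as below. The helper names parse_n, random_matrix, and cpu_seconds are hypothetical, and clock() from the C standard library is only one possible timer; it reports processor time, which is reasonable for a single-threaded run.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Parse n from a command-line argument.
   Returns 0 if the argument is not a positive integer. */
size_t parse_n(const char *arg)
{
    assert(arg != NULL);
    char *end = NULL;
    long v = strtol(arg, &end, 10);
    if (end == arg || *end != '\0' || v <= 0)
        return 0;
    return (size_t)v;
}

/* Allocate an n x n matrix of doubles filled with random values in [0, 1].
   Returns NULL if the allocation fails. */
double *random_matrix(size_t n)
{
    double *m = malloc(n * n * sizeof *m);
    if (m == NULL)
        return NULL;
    for (size_t i = 0; i < n * n; i++)
        m[i] = (double)rand() / RAND_MAX;
    return m;
}

/* Processor time used so far, in seconds. */
double cpu_seconds(void)
{
    return (double)clock() / CLOCKS_PER_SEC;
}

/* Intended use inside main(int argc, char **argv), shown as a comment only:
 *
 *   size_t n = parse_n(argv[1]);
 *   double *a = random_matrix(n), *b = random_matrix(n);   // not timed
 *   double *c = calloc(n * n, sizeof *c);                   // not timed
 *   double t0 = cpu_seconds();
 *   mmult(a, b, c, n);                                      // timed
 *   double t1 = cpu_seconds();
 *   printf("%zu %f\n", n, t1 - t0);
 */
```

Printing n and the elapsed time on one line makes the output easy to collect into a data file for plotting.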
Testing
Testing is a fundamental part of good programming, so for all your projects you will need to write test functions. Having unit tests will speed up the debugging, because they will help localize the bug.
For matrix multiplication in particular, testing is fairly straightforward: you'll need to check that the outputs of your blocked and recursive algorithms are correct, by comparing them with the output of the straightforward MM.
A test could look like this:
//file that implements the blocked_mmult()
main(..) {
//a,b are the matrices, c is the result
a = calloc(...)
b = calloc(...)
c = calloc(...)
//initialize a,b with random values
...
//call the blocked mmult
//start timer
blocked_mmult(a,b,c,n)
//end timer
//test it; note, this is not timed
test(a,b,c, n)
}
//input: a,b are matrices and c=a*b computed by blocked_mmult
//this function computes a*b by straightforward matrix multiplication and tests whether c is the same
//if it finds an error, it prints something useful and exits
//if all good, it prints that the test was passed
void test(a,b,c,n) {
//compute d = a*b with straightforward mmult
d = calloc(...)
for i
for j
for k
//now compare c with d element by element; if you find an element where they're different,
//print an error message and exit()
//if you got to the end of the loop, all elements match
printf("congratulations: test passed\n");
}
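One detail worth keeping in mind when comparing element by element: floating-point addition is not associative, so an algorithm that sums the products in a different order can differ from the straightforward result in the last few bits. The sketch below therefore compares with a small relative tolerance. It is an illustration under assumptions: the name check_product, the tol parameter, and returning a flag instead of calling exit() are choices made here, not part of the assignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

static double abs_d(double x) { return x < 0.0 ? -x : x; }

/* Returns 1 if c matches a*b (computed by the straightforward algorithm)
   within relative tolerance tol, and 0 at the first mismatch.
   All matrices are row-major n x n arrays of doubles. */
int check_product(const double *a, const double *b, const double *c,
                  size_t n, double tol)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++) {
            double d = 0.0;              /* reference value of (a*b)[i][j] */
            for (size_t k = 0; k < n; k++)
                d += a[i * n + k] * b[k * n + j];
            if (abs_d(c[i * n + j] - d) > tol * (1.0 + abs_d(d))) {
                fprintf(stderr, "mismatch at (%zu,%zu): got %g, expected %g\n",
                        i, j, c[i * n + j], d);
                return 0;
            }
        }
    return 1;
}
```

A caller could print the success message when the function returns 1 and exit with an error otherwise; a tolerance around 1e-9 is a plausible starting point for random entries in [0, 1], but that value is a guess to adjust.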
Comments
The programming part of this project is pretty straightforward. The
core of the project is running experiments, so plan to spend most of
your time on that. You will need to record running times and plot
them. You may want to look into writing scripts to run the
experiments. If you are interested, ask me and I'll post an example
script (I'll probably post it anyway).
Finally, doing this via GitHub means I have to specify a deadline,
and I have no control over what happens if you push after the
deadline. So you need to meet the deadline. Guidelines
for this assignment (and any assignment): start early and plan
accordingly.
What to turn in
in GitHub:
- three source files and a Makefile
- 3 plots, one for each algorithm, with different values of n
- one additional plot showing the running time as a function of block size r
- other supporting files and data
in class:
- the plots (hardcopy)
- A cover page containing
- your name
- your username in GitHub
- the URL of the repository, and
- whether you worked alone or with a partner (if you worked with a partner please specify whose repository includes the team's work).
- the L1 cache size of the machine on which you ran the experiments
Comments, in retrospect (for next time)
- Each team needs to give me the GitHub address of their repository (searching is slow), preferably on a sheet of paper (so that I don't have to search my email)
- Make it clear that they have to handle all values of n, not only powers of 2
- Want the three algorithms on the same plot so that we can compare