Let A be a matrix of n elements, with sqrt n rows and sqrt n columns. Matrices have many applications, as you know: A might represent the adjacency matrix of a graph, the elevations in a grid terrain, or a matrix used in numerical simulations in physics. Because matrices appear in so many applications, we want to be able to implement the fundamental operations on them efficiently.
In this project you will implement matrix transposition and matrix multiplication, and you will experimentally analyze the performance of your algorithms as a function of n (matrix size).
Matrix transposition: write a function that computes the transpose of A, call it B. Implement two approaches: the straightforward one, and the recursive one described in class.
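As a rough illustration of the two approaches, here is a minimal sketch. It assumes the matrix is kept in a flat list in row-major order and that the side length m = sqrt n is a power of two. The recursive version is a generic divide-and-conquer over quadrants, which may differ in detail from the algorithm described in class. The names transpose_naive, transpose_recursive, and cutoff are hypothetical.

```python
def transpose_naive(a, m):
    """Transpose an m x m matrix stored row-major in the flat list a."""
    b = [0] * (m * m)
    for i in range(m):
        for j in range(m):
            b[j * m + i] = a[i * m + j]
    return b


def transpose_recursive(a, b, m, r0=0, c0=0, size=None, cutoff=2):
    """Write the transpose of the size x size block of a at (r0, c0) into b.

    Assumption: m is a power of two, so every block splits into four
    equal quadrants until the cutoff is reached.
    """
    if size is None:
        size = m
    if size <= cutoff:
        # Base case: transpose the small block directly.
        for i in range(r0, r0 + size):
            for j in range(c0, c0 + size):
                b[j * m + i] = a[i * m + j]
        return
    h = size // 2
    # Recurse on the four quadrants of the current block.
    for dr in (0, h):
        for dc in (0, h):
            transpose_recursive(a, b, m, r0 + dr, c0 + dc, h, cutoff)
```

To use it, allocate the output first, for example b = [0] * (m * m), then call transpose_recursive(a, b, m). The cutoff is a tunable base-case size, not something the project prescribes.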
Matrix multiplication: write a function that computes the product of two matrices of the same size, A and B. Implement two approaches: the straightforward one, and the recursive one described in class.
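For multiplication, a sketch along the same lines might look as follows. It again assumes a flat row-major list and a power-of-two side length m = sqrt n. The recursive version splits each matrix into quadrants and performs eight recursive block products. This is one standard divide-and-conquer scheme and is only assumed to resemble the one from class. The function names are hypothetical.

```python
def multiply_naive(a, b, m):
    """Return the product of two m x m row-major matrices as a flat list."""
    c = [0] * (m * m)
    for i in range(m):
        for j in range(m):
            s = 0
            for k in range(m):
                s += a[i * m + k] * b[k * m + j]
            c[i * m + j] = s
    return c


def multiply_recursive(a, b, c, m, ar=0, ac=0, br=0, bc=0, cr=0, cc=0, size=None):
    """Accumulate (block of a) * (block of b) into a block of c.

    Each block is size x size and is identified by its top-left corner.
    Assumptions: m is a power of two, and c is zero-filled by the caller.
    """
    if size is None:
        size = m
    if size == 1:
        c[cr * m + cc] += a[ar * m + ac] * b[br * m + bc]
        return
    h = size // 2
    # C[i][j] += A[i][k] * B[k][j] over the 2 x 2 grid of quadrants.
    for i in (0, h):
        for j in (0, h):
            for k in (0, h):
                multiply_recursive(a, b, c, m,
                                   ar + i, ac + k,
                                   br + k, bc + j,
                                   cr + i, cc + j, h)
```

In practice you would probably stop the recursion at a larger base-case size and fall back to the straightforward loops there. The base case of size 1 is used here only to keep the sketch short.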
To store the matrix A, we'll use a 1-dimensional array, call it Z[0..n-1]. To access an element (i,j) from the matrix we'll define a function that maps pairs (i,j) to indices z in {0,1,...,n-1}. This function is bijective, that is, every pair (i,j) maps to a unique z(i,j) and the other way around.
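The simplest such bijection is the row-major mapping, sketched below with its inverse. Here m stands for sqrt n, and the names z_row_major and pair_from_z are hypothetical.

```python
def z_row_major(i, j, m):
    """Map the pair (i, j) to an index in {0, ..., m*m - 1}, row by row."""
    return i * m + j


def pair_from_z(z, m):
    """Inverse of z_row_major: recover (i, j) from the index z."""
    return divmod(z, m)
```

Because every index z in {0,1,...,n-1} is hit exactly once, going from (i,j) to z and back returns the original pair.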
For simplicity, let's assume that sqrt n=2^k, for some k in N. Let bin(x) denote the binary representation of a number x, with x in {0,1,..., sqrt n -1}. Note that the binary representation of the index of a row or column in the matrix requires k digits, and the binary representation of an index in Z requires 2k digits.
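One mapping that the k-digit and 2k-digit observation makes possible is bit interleaving: the k digits of bin(i) and the k digits of bin(j) are alternated to form a 2k-digit index. The sketch below is an illustration only and is an assumption, not necessarily one of the layouts required for the project. The name z_interleaved is hypothetical.

```python
def z_interleaved(i, j, k):
    """Interleave the k bits of i and j into a 2k-bit index.

    Bits of the row index i go to the odd positions and bits of the
    column index j go to the even positions (position 0 is the lowest).
    """
    z = 0
    for bit in range(k):
        z |= ((i >> bit) & 1) << (2 * bit + 1)
        z |= ((j >> bit) & 1) << (2 * bit)
    return z
```

For example, with k=2, i=2 (binary 10) and j=3 (binary 11), the interleaved digits are 1101, so z_interleaved(2, 3, 2) is 13.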
Some mappings:
To initialize the matrices with values, use random numbers.
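A minimal sketch of the initialization, assuming a flat list of floats and an optional seed so that runs can be repeated. The name random_matrix and the seed parameter are hypothetical.

```python
import random


def random_matrix(m, seed=None):
    """Return m*m random floats in [0, 1) as a flat list.

    Passing a seed makes the values reproducible across runs.
    """
    rng = random.Random(seed)
    return [rng.random() for _ in range(m * m)]
```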
For experiments use the grid, and run experiments only on the 1g nodes.
There are 3 algorithms, and 3 matrix layouts, therefore you'll have 9 different modules to test.
Plots: For each algorithm, show the running times for each of the 3 layouts. Experiment with both small and large values for n, and show this on different plots. Therefore there will be 6 plots: For each algorithm, you'll have 2 plots, one for small values of n so that we can see the effect of the caches, and one for large n, so that we can see the IO bottleneck.
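To collect the running times for these plots, a small timing helper such as the following may be enough. It is only a sketch: the name time_call and the choice of reporting the best of a few repeats are assumptions, not requirements of the project.

```python
import time


def time_call(fn, *args, repeats=3):
    """Return the smallest wall-clock time, in seconds, over several runs of fn(*args)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best
```

time.perf_counter measures elapsed wall-clock time, so for large n the measurement includes time spent waiting on IO, which is the effect the large-n plots are meant to show.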
Hand in: Email me the code so that I can test it. Bring to class a hardcopy of the code, and a paper summarizing your work. The paper is a very important part of your work, so plan to spend a fair amount of time on it.
The paper should include:
Type your paper in LaTeX. You can use the following template.