The goal of this project is to compute the total viewshed of a grid terrain, parallelize the computation using OpenMP, and assess the effect of this parallelization using the multicore servers on Bowdoin's HPC grid. You will also explore how a row-major grid layout versus a blocked grid layout affects overall performance. You will describe your results in a paper.
[ltoma@dover:~] ./totalview test2.asc test2totvis.asc

This will read elevation grid test2.asc, compute the total viewshed and save it as grid test2totvis.asc. Note: the grids are assumed to be in the ASCII format.
The output grid, test2totvis.asc, represents the total viewshed grid of test2.asc.
Use the viewshed function from the previous project, and modify it so that instead of returning a viewshed grid, it simply returns a count of how many grid points are visible (i.e. equal to 1). For example, it could look like this:
/* Compute the viewshed of (vprow, vpcol) and store it in the viewshed
   grid vg, which is assumed to be allocated prior to this call. After
   computing the viewshed, count its size and return it. */
int compute_viewshed_size(Grid eg, Grid vg, int vprow, int vpcol);

Note that the viewshed grid is allocated outside, to save time (see below). Then you'll have a parallel loop that calls this function for all (i,j). It might look something like this:
/* Compute the total viewshed of elevation grid eg, and store it in the
   grid tvg, which is assumed to be allocated prior to this call. */
void compute_total_viewshed(Grid eg, Grid tvg) {
  Grid tmpgrid;  // we'll use this to store each individual viewshed
  // allocate tmpgrid here
  int i, j;
  for (i = 0; i < eg.nrows; i++) {
    for (j = 0; j < eg.ncols; j++) {
      // compute_viewshed_size resets and refills tmpgrid on each call
      set(tvg, i, j, compute_viewshed_size(eg, tmpgrid, i, j));
    } // for j
  } // for i
}
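Note that the sketch above allocates a single tmpgrid, which is fine serially but becomes a hazard once the loop is parallelized: all threads would overwrite the same temporary viewshed. Below is a minimal OpenMP sketch that gives each thread its own temporary grid. The helpers grid_init and grid_free are hypothetical names for your own allocate/free routines; adapt them to your Grid code.

#include <omp.h>

/* Sketch: parallel total viewshed. Each thread owns a private temporary
   grid, so concurrent calls to compute_viewshed_size do not interfere. */
void compute_total_viewshed(Grid eg, Grid tvg) {
  #pragma omp parallel
  {
    /* hypothetical helper: allocate a grid with eg's dimensions */
    Grid tmpgrid = grid_init(eg.nrows, eg.ncols);

    /* dynamic schedule: viewshed cost varies from viewpoint to viewpoint */
    #pragma omp for schedule(dynamic)
    for (int i = 0; i < eg.nrows; i++)
      for (int j = 0; j < eg.ncols; j++)
        set(tvg, i, j, compute_viewshed_size(eg, tmpgrid, i, j));

    grid_free(tmpgrid);  /* hypothetical helper */
  }
}

Writes to tvg are race-free because each (i,j) is written by exactly one iteration; the temporary grid is the only shared state to worry about, hence the per-thread copy.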
./totviewshed set1.asc set1totview.asc 1
computing total viewshed on set1.asc..
TOTAL xxxx seconds
Testing for race conditions: try running with different numbers of cores. Render the output, and compare it to the output obtained with one core. If they differ, that's bad: you have a race condition. Go back to debugging.
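If you prefer a programmatic check over eyeballing the rendered outputs, a tiny helper (a sketch, using the grid getter described below) can compare two output grids cell by cell:

/* return 1 if grids a and b have the same dimensions and values, 0 otherwise */
int grids_equal(Grid a, Grid b) {
  if (a.nrows != b.nrows || a.ncols != b.ncols) return 0;
  for (int i = 0; i < a.nrows; i++)
    for (int j = 0; j < a.ncols; j++)
      if (get(a, i, j) != get(b, i, j)) return 0;
  return 1;
}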
Here is how I would do it: the functions that compute the viewshed and the total viewshed should use the grid getter and setter to access the grid. When they want to read or write an element of a grid at row i and column j, they should use
/* return the element in g at row i and column j */
float get(Grid g, int i, int j);

/* set the element in g at row i and column j to value x */
void set(Grid g, int i, int j, float x);

Anyone who uses a grid should make no assumptions about where the element at row i and column j is stored; that is the internal business of the grid, and it's encapsulated in the grid.
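For concreteness, if Grid stores its elements in a flat array, the row-major getter and setter are one-liners. The struct below is illustrative (the field names are assumptions, not a required interface):

typedef struct {
  int nrows, ncols;
  float *data;  /* nrows*ncols elements; the layout is hidden behind get/set */
} Grid;

/* row-major: element (i,j) lives at offset i*ncols + j */
float get_rowmajor(Grid g, int i, int j) {
  return g.data[(long)i * g.ncols + j];
}

void set_rowmajor(Grid g, int i, int j, float x) {
  g.data[(long)i * g.ncols + j] = x;
}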
At the top of your grid code define a flag that says whether to use row-major or blocked layout. The user chooses between row-major and blocked order by defining one or the other, and recompiling.
#define ROWMAJOR
//#define BLOCKED

Then I would define:
/* return the element in g at row i and column j */
float get(Grid g, int i, int j) {
#ifdef ROWMAJOR
  return get_rowmajor(g, i, j);
#else
  return get_blocked(g, i, j);
#endif
}
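For the blocked layout, one common scheme (a sketch; it assumes the data array is padded out to whole B x B tiles) stores the grid as a row-major sequence of B x B tiles, each tile itself stored row-major:

#define B 64  /* block side length; a tuning parameter to experiment with */

/* blocked: element (i,j) lives in tile (i/B, j/B), at (i%B, j%B) inside it */
float get_blocked(Grid g, int i, int j) {
  long tiles_per_row = (g.ncols + B - 1) / B;     /* tiles in one tile-row */
  long tile = (i / B) * tiles_per_row + (j / B);  /* which tile */
  long offset = (long)(i % B) * B + (j % B);      /* position inside the tile */
  return g.data[tile * B * B + offset];
}

set_blocked mirrors this computation. Note that with this scheme the data array must be allocated with room for the padded tiles, i.e. ceil(nrows/B) * ceil(ncols/B) * B * B floats.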
The servers in the grid are not interactive machines: you cannot interact with them the way you interact with dover and foxcroft, or with your laptop. The Grid is set up to run batch jobs only (not interactive and/or GUI applications).
The servers on the Grid run the Sun Grid Engine (SGE), which is a software environment that coordinates the resources on the grid. The grid has a headnode which accepts jobs, puts them in a waiting queue until they can be run, sends them to the computational node(s) on the grid to run them, manages them while they run, and notifies the owner when the job is finished. This headnode is a machine called moosehead. To interact with The Grid you need to login to the Grid headnode "moosehead.bowdoin.edu" via an SSH client program.
ssh moosehead.bowdoin.edu
Moosehead is an old server which was configured to run the Sun Grid Engine and do whatever a headnode is supposed to do: moosehead accepts jobs, puts them in a queue until they can be executed, sends them to an execution machine, manages them during execution, and logs the record of their execution when they are finished.
Moosehead runs Linux, so in principle you can run on it anything that you could run on dover. However, DJ (the sysadmin, and Director of Bowdoin's HPC Grid) asks that you don't. Moosehead is an old machine. Use it only to submit jobs to the grid and to interact with the grid. Do your compiling, developing and testing somewhere else (e.g. on dover).
The Grid uses the same shared filespace as all of the Bowdoin Linux machines, so you can access the same home directory and data space as with dover or foxcroft (if you need to transfer files from a machine that is not a part of the Bowdoin network, use scp from your machine to dover or foxcroft first).
To submit to the grid you have two options:
ssh moosehead
cd [directory-where-your-code-is-compiled]
hpcsub -pe smp 8 -cmd [your-code] [arguments to pass to the program]

The arguments -pe smp 8 are optional (but, if you are running OpenMP code, you should use them). They specify that your code is to be run in the SMP environment, with 8 cores (here 8 is only an example; it can be any number you want).
For example, if I want to run hellosmp that we talked about in class (which you can find here) using 8 CPU cores in the SMP environment, I would do:
ssh moosehead
[ltoma@moosehead:~]$ pwd
/home/ltoma
[ltoma@moosehead:~]$ cd public_html/teaching/cs3225-GIS/fall17/Code/OpenMP/
[ltoma@moosehead:~/public_html/teaching/cs3225-GIS/fall17/Code/OpenMP]$ ls
example1.c  example2.cpp  example3.c  example4.c  hellosmp  hellosmp.c  hellosmp.h  hellosmp.o  Makefile
[ltoma@moosehead:~/public_html/teaching/cs3225-GIS/fall17/Code/OpenMP]$ hpcsub -pe smp 8 -cmd hellosmp
Submitting job using: qsub -pe smp 8 hpc.10866
Your job 236150 ("hpc.10866") has been submitted
The headnode puts this job in the queue and starts looking for 8 free cores. When 8 cores become available, it assigns them to your job. While your job is running, no other job can use the 8 cores it was assigned: they are exclusively yours while your job runs. To check the jobs currently in the queue, do:
qstat

To check on all jobs running on the cluster, type

qstat -u "*"

For a full listing of all jobs on the cluster, type

qstat -f -u "*"

To display the list of all jobs belonging to user foo, type

qstat -u foo

After I submit a job I usually check the queue:
[ltoma@moosehead:~/public_html/teaching/cs3225-GIS/fall17/Code/OpenMP]$ qstat
job-ID  prior    name       user   state  submit/start at      queue          slots  ja-task-ID
------------------------------------------------------------------------------------------------
 236150 0.00000  hpc.10866  ltoma  qw     10/12/2016 15:53:20                 8

[ltoma@moosehead:~/public_html/teaching/cs3225-GIS/fall17/Code/OpenMP]$ qstat
job-ID  prior    name       user   state  submit/start at      queue          slots  ja-task-ID
------------------------------------------------------------------------------------------------
 236150 0.58278  hpc.10866  ltoma  r      10/12/2016 15:53:27  all.q@moose15  8

Note how the job initially shows as "qw" (queued and waiting) and then changes to "r" (running).
When the job is done you will get an email. If you list the files, you will notice a new file called "hpc.[script-id].o[job-number]" (here, hpc.10866.o236150). This file holds the standard output of your job: all the print commands are redirected to this file.
[ltoma@moosehead:~/public_html/teaching/cs3225-GIS/fall17/Code/OpenMP]$ qstat
[ltoma@moosehead:~/public_html/teaching/cs3225-GIS/fall17/Code/OpenMP]$ ls
example1.c  example2.cpp  example3.c  example4.c  hellosmp  hellosmp.c  hellosmp.h  hellosmp.o  hpc.10866.o236150  Makefile
[ltoma@moosehead:~/public_html/teaching/cs3225-GIS/fall17/Code/OpenMP]$ cat hpc.10866.o236150
I am thread 1. Hello world!
I am thread 2. Hello world!
I am thread 7. Hello world!
I am thread 0. Hello world!
I am thread 5. Hello world!
I am thread 6. Hello world!
I am thread 4. Hello world!
I am thread 3. Hello world!
The second option is to write a submission script, say myscript.sh, that contains the SGE options and the command to run:

#!/bin/bash
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -M (my_login_name)@bowdoin.edu -m b -m e

./hellosmp

To submit your job to the grid you will do:
ssh moosehead
cd [folder-containing-myscript.sh]
qsub myscript.sh

Example:
[ltoma@moosehead:~/public_html/teaching/cs3225-GIS/fall17/Code/OpenMP]$ cat myscript.sh
#!/bin/bash
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -M ltoma@bowdoin.edu -m b -m e

#./hellosmp
./example1
[ltoma@moosehead:~/public_html/teaching/cs3225-GIS/fall17/Code/OpenMP]$ qsub myscript.sh
Your job 236154 ("myscript.sh") has been submitted
[ltoma@moosehead:~/public_html/teaching/cs3225-GIS/fall17/Code/OpenMP]$ qstat
job-ID  prior    name        user   state  submit/start at      queue          slots  ja-task-ID
------------------------------------------------------------------------------------------------
 236154 0.00000  myscript.s  ltoma  qw     10/12/2016 16:00:17                 1

[ltoma@moosehead:~/public_html/teaching/cs3225-GIS/fall17/Code/OpenMP]$ qstat
job-ID  prior    name        user   state  submit/start at      queue          slots  ja-task-ID
------------------------------------------------------------------------------------------------
 236154 0.50500  myscript.s  ltoma  r      10/12/2016 16:00:27  all.q@moose22  1

[ltoma@moosehead:~/public_html/teaching/cs3225-GIS/fall17/Code/OpenMP]$ qstat
[ltoma@moosehead:~/public_html/teaching/cs3225-GIS/fall17/Code/OpenMP]$

Note how your job went from "qw" to not in the queue (basically it ran and finished so fast that we could not see it).
Each job creates a file named by appending the job number to the script name; in our case this is a file called "myscript.sh.o[job-number]". These .o* files are the equivalent of what you would see on the console if you ran the program interactively.
[ltoma@moosehead:~/public_html/teaching/cs3225-GIS/fall17/Code/OpenMP]$ ls
example1    example1.c  example2.cpp  example3.c  example4.c  hello  hellosmp  hellosmp.c  hellosmp.h  hellosmp.o  hpc.10866.o236150  Makefile  myscript.sh  myscript.sh.o236154
[ltoma@moosehead:~/public_html/teaching/cs3225-GIS/fall17/Code/OpenMP]$ cat myscript.sh.o236154
Hello World from thread 0
There are 1 threads
[ltoma@moosehead:~/public_html/teaching/cs3225-GIS/fall17/Code/OpenMP]$
Looking at the output we see that example1 was run with just one thread. That's because when we submitted we did not specify that we wanted the SMP environment and how many threads we wanted, so we got the default (a single slot, hence one thread). When running OpenMP code you need to submit using the arguments -pe smp [number-of-threads]. For example:
[ltoma@moosehead:~/public_html/teaching/cs3225-GIS/fall17/Code/OpenMP]$ qsub -pe smp 8 myscript.sh
Your job 236155 ("myscript.sh") has been submitted
[ltoma@moosehead:~/public_html/teaching/cs3225-GIS/fall17/Code/OpenMP]$ qstat
[ltoma@moosehead:~/public_html/teaching/cs3225-GIS/fall17/Code/OpenMP]$ cat myscript.sh.o236155
Hello World from thread 5
Hello World from thread 1
Hello World from thread 0
There are 8 threads
Hello World from thread 6
Hello World from thread 7
Hello World from thread 2
Hello World from thread 4
Hello World from thread 3

Ah, that's better.
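For reference, a program consistent with example1's output would look roughly like this (a sketch; the actual course file may differ). Compile with gcc -fopenmp:

#include <stdio.h>
#include <omp.h>

int main(void) {
  #pragma omp parallel
  {
    printf("Hello World from thread %d\n", omp_get_thread_num());
    /* let one thread report the team size */
    #pragma omp single
    printf("There are %d threads\n", omp_get_num_threads());
  }
  return 0;
}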
If you are running an experimental analysis and you care about the timings, you want to request the whole machine for yourself, even if your job is only going to use x processors. You can do that by including the flag -l excl=true:
[ltoma@moosehead:~/public_html/teaching/cs3225-GIS/fall17/Code/OpenMP]$ qsub -l excl=true -pe smp 8 myscript.sh
Your job 236157 ("myscript.sh") has been submitted
[ltoma@moosehead:~/public_html/teaching/cs3225-GIS/fall17/Code/OpenMP]$ qstat
job-ID  prior    name        user   state  submit/start at      queue          slots  ja-task-ID
------------------------------------------------------------------------------------------------
 236157 0.00000  myscript.s  ltoma  qw     10/12/2016 16:05:15                 8

[ltoma@moosehead:~/public_html/teaching/cs3225-GIS/fall17/Code/OpenMP]$ qstat
job-ID  prior    name        user   state  submit/start at      queue          slots  ja-task-ID
------------------------------------------------------------------------------------------------
 236157 0.60500  myscript.s  ltoma  r      10/12/2016 16:05:27  all.q@moose22  8

[ltoma@moosehead:~/public_html/teaching/cs3225-GIS/fall17/Code/OpenMP]$ qstat
[ltoma@moosehead:~/public_html/teaching/cs3225-GIS/fall17/Code/OpenMP]$
(Briefly) describe the total viewshed problem, what the project is doing, and why (to support the why, bring in the running time of the total viewshed on one processor).
Here you'll want to say that you use OpenMP, which conveniently provides a parallel for loop; the important part is to describe the details of your parallel for loop.
Describe the experiments you ran to assess the effect of parallelization: include the table with the running times and the plot of the speedup.
Datasets: Use set1.asc. It would be great if you also ran experiments for kaweah.asc, but since the running times are larger, it's optional.
For the experiments, include some brief detail on the command you used to submit the jobs that can help us interpret and compare the running times with those of your peers, such as whether you used the -l excl=true flag. Also include info on which server ran your job.
The table: the running time of your code on the grid with number of cores P = 1, 2, 4, 8, 12, 16, 20, 24, 32, 40, and the speedup obtained in each case (speedup is defined as T1/Tk, where T1 is the time to run with P = 1 core and Tk is the time to run with P = k cores).
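As a sanity check with made-up numbers: if the run with P = 1 takes T1 = 400 seconds and the run with P = 8 takes T8 = 60 seconds, then the speedup at 8 cores is 400/60 ≈ 6.7, somewhat below the ideal speedup of 8.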
The plot: the speedup as a function of the number of cores, for set1.asc.
Also include a screenshot of the total viewshed computed by your code on set1.asc (use render2d to render it).
Discuss your findings.
Describe the experiments you have done to assess the effect of the blocked layout, and discuss your findings. Describe how you chose the block size.
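If you need a starting point for the block size, here is some illustrative arithmetic (the cache numbers are assumptions about the hardware, not measurements): with 4-byte floats, a 64 x 64 block occupies 64 * 64 * 4 = 16 KB, so a couple of such blocks fit in a typical 32 KB L1 data cache, while a 256 x 256 block (256 KB) would not. Timing a few powers of two around that range is a reasonable way to justify your choice in the paper.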