Assigned: | Saturday, November 10. |
Due Date: | Tuesday, November 20. |
Collaboration Policy: | Level 1 (refer to the official policy for details) |
Group Policy: | Pair-optional (you may work in a group of 2 if you wish) |
This lab will help you understand the different types of cache designs and the impact that cache memories can have on the performance of your programs. To do so, you will write a C program simulating the behavior of a cache memory on real-world memory usage traces.
Your cache simulator will simulate an arbitrary cache memory, as defined by the usual three values discussed in class: S (the number of sets), E (the number of lines per set), and B (the size of a data block). Specifically, your program will simulate the behavior of the specified cache on a trace file, which consists of a series of memory accesses that you will replay in simulation. The output of your simulator will be three values: the number of cache hits, the number of cache misses, and the number of block evictions performed.
To help you test your program, you have been provided with a reference implementation, as well as a driver program that will automatically test your simulator by comparing its results against the reference simulator. Details on the trace files and the reference simulator are provided below.
The traces subdirectory of your lab directory contains a collection of reference trace files that will be used to evaluate the correctness of your cache simulator. These trace files describe a series of memory accesses and are generated by valgrind. The traces have the following format:
I 0400d7d4,8
 M 0421c7f0,4
 L 04f6b868,8
 S 7ff0005c8,8
Each line denotes one or two memory accesses. The general format of each line is
[space]operation address,size
The operation field denotes the type of memory access: "I" denotes an instruction load, "L" a data load, "S" a data store, and "M" a data modify (i.e., a data load followed by a data store). There is never a space before each "I", while there is always a space before each "M", "L", or "S". The address field specifies a 64-bit hex memory address. The size field specifies the number of bytes accessed by the operation.
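The line format described above maps naturally onto a single sscanf pattern. The following is a minimal sketch, not part of the provided lab code: the trace_access struct and the parse_trace_line function are hypothetical names chosen for illustration.

```c
#include <stdio.h>

/* Hypothetical record for one parsed trace line (not part of the lab files). */
typedef struct {
    char op;                    /* 'I', 'L', 'S', or 'M' */
    unsigned long long address; /* value of the hex address field */
    unsigned size;              /* number of bytes accessed */
} trace_access;

/* Parses one line such as " M 0421c7f0,4" into *out.
 * Returns 1 on success and 0 if the line does not match the format. */
int parse_trace_line(const char *line, trace_access *out)
{
    /* The leading space in the format string skips any whitespace, so the
     * same pattern accepts both "I ..." and " L ..." lines. */
    int fields = sscanf(line, " %c %llx,%u",
                        &out->op, &out->address, &out->size);
    return fields == 3;
}
```

A caller that needs to tell instruction lines apart from data lines can check either out->op or whether the first character of the line is a space.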
In addition to the included traces, you can use valgrind to generate your own memory traces in this format, like so:
$ valgrind --log-fd=1 --tool=lackey -v --trace-mem=yes ls -l
The above example will run the program ls -l and dump a trace of its memory accesses to stdout (i.e., the terminal). To save the output in a file, just redirect to a file by appending something like > ls.trace to the end of the valgrind command as we've seen in the past.
You have been provided with the compiled executable of a reference cache simulator called cachesim-ref to help you in testing your program. Just like the simulator you will write, the reference simulator outputs the total number of hits, misses, and evictions when running a given valgrind trace through the specified cache.
The reference simulator takes the following command-line arguments:

-h: Optional help flag that prints usage info
-v: Optional verbose flag that prints trace info
-s <s>: Number of set index bits (the number of bits, not the actual number of sets!)
-E <E>: Associativity (number of lines per set)
-b <b>: Number of block bits (the number of bits, not the actual block size!)
-t <tracefile>: Name of the valgrind trace to replay

For example, here is a run of the reference simulator:
$ ./cachesim-ref -s 4 -E 1 -b 4 -t traces/t2.trace
hits:4 misses:5 evictions:3
Running the same with the addition of the -v flag will print information about each memory access in the trace:
$ ./cachesim-ref -v -s 4 -E 1 -b 4 -t traces/t2.trace
L 10,1 miss
M 20,1 miss hit
L 22,1 hit
S 18,1 hit
L 110,1 miss eviction
L 210,1 miss eviction
M 12,1 miss eviction hit
hits:4 misses:5 evictions:3
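To see why the accesses to addresses 10, 110, and 210 (hex) evict one another in the run above, it helps to split each address into its set index and tag. The sketch below assumes the conventional address layout (tag in the high bits, then s set index bits, then b block offset bits); the function names are hypothetical and not part of the lab files.

```c
#include <assert.h>
#include <stdint.h>

/* Set index: the s bits directly above the b block-offset bits.
 * Assumes 0 <= s, 0 <= b, and s + b < 64. */
uint64_t set_index(uint64_t address, int s, int b)
{
    assert(s >= 0 && b >= 0 && s + b < 64);
    uint64_t mask = ((uint64_t)1 << s) - 1; /* low s bits set */
    return (address >> b) & mask;
}

/* Tag: all address bits above the set-index and block-offset bits. */
uint64_t tag_bits(uint64_t address, int s, int b)
{
    assert(s >= 0 && b >= 0 && s + b < 64);
    return address >> (s + b);
}
```

With s = 4 and b = 4, addresses 0x10, 0x110, and 0x210 all land in set 1 with tags 0, 1, and 2 respectively. Since E = 1, that set holds only one line, so each of these accesses has to replace the line left by the previous one.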
Your job is to complete your own cache simulator such that it takes the same command-line arguments as the reference simulator and produces identical output. You should write your simulator in cachesim.c. Note that this file is almost completely empty, so you'll need to write your simulator from scratch.
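One of the first pieces to write from scratch is the storage for the cache itself. Because the cache dimensions come from the command line, that storage has to be sized at run time. The following is only a sketch under assumptions: the struct layouts and function names are hypothetical, each line tracks just a valid bit and a tag (any replacement bookkeeping your design needs would be added), and calloc is used as a zero-initializing variant of malloc.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-line state; extend as your design requires. */
typedef struct {
    int valid;
    uint64_t tag;
} cache_line;

/* Hypothetical cache descriptor. */
typedef struct {
    int s, E, b;
    uint64_t num_sets;  /* S = 2^s */
    cache_line *lines;  /* S * E lines; set i starts at lines[i * E] */
} cache;

/* Allocates S * E zeroed lines. Returns 1 on success, 0 on failure. */
int cache_init(cache *c, int s, int E, int b)
{
    c->s = s;
    c->E = E;
    c->b = b;
    c->num_sets = (uint64_t)1 << s;
    /* calloc zero-fills, so every valid bit starts out as 0. */
    c->lines = calloc(c->num_sets * (uint64_t)E, sizeof(cache_line));
    return c->lines != NULL;
}

/* Returns a pointer to the first of the E lines in the given set. */
cache_line *cache_set(cache *c, uint64_t set)
{
    return &c->lines[set * (uint64_t)c->E];
}

/* Releases the storage allocated by cache_init. */
void cache_free(cache *c)
{
    free(c->lines);
    c->lines = NULL;
}
```

A single flat array indexed as set * E + line is one option; an array of per-set arrays would work equally well and is a matter of design preference.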
In addition to following the reference implementation output format, you must adhere to the following specifications while designing your cache simulator. Follow each of these instructions carefully, as each one of them has the potential to completely change your cache's behavior if ignored.
- Your simulator must work correctly for arbitrary values of s, E, and b. This means that you will need to allocate storage for your data structures using malloc.
- valgrind does not put a space in front of "I" as it does for "M", "L", and "S", which may be helpful in parsing the trace.
- In your main function, you must call the existing function printSummary to output your results. See cache.h for details.
- ... valgrind traces.

Your lab files contained in lab5 consist of the following:
cachesim.c: Your cache simulator program.
cachesim-ref: The compiled reference simulator.
cache.c and cache.h: Helper code for the printSummary function.
Makefile: Included Makefile for building your program.
traces/: Directory of memory trace files for testing.
test-cachesim: Program to automatically test your simulator against the reference simulator.

To test your cache simulator against all of the included trace files and output an autograded correctness score, just execute the test-cachesim program:
$ ./test-cachesim
                   Your simulator        Reference simulator
Points (s,E,b)    Hits  Misses  Evicts    Hits  Misses  Evicts
     3 (1,1,1)       9       8       6       9       8       6  traces/t1.trace
     3 (4,2,4)       4       5       2       4       5       2  traces/t2.trace
     3 (2,1,4)       2       3       1       2       3       1  traces/t3.trace
     3 (2,1,3)     167      71      67     167      71      67  traces/t4.trace
     3 (2,2,3)     201      37      29     201      37      29  traces/t4.trace
     3 (2,4,3)     212      26      10     212      26      10  traces/t4.trace
     3 (5,1,5)     231       7       0     231       7       0  traces/t4.trace
     6 (5,1,5)  265189   21775   21743  265189   21775   21743  traces/t5.trace
    27

Simulator summary: scored 27 of 27 points
Note that your simulator may be tested on traces not in the set of reference traces, and thus a full score on test-cachesim does not necessarily mean that your program will receive full correctness marks.
Here are some general tips for working on the cache simulator:
- ... struct for that purpose.
- Store 64-bit addresses using unsigned long long, because a regular long is often just a 32-bit value (implementation dependent). You may wish to use a typedef to avoid repeatedly typing this type. Alternatively, you can use the uint64_t type from <stdint.h> (which is guaranteed to be an unsigned, 64-bit value).
- ... t2.trace and t3.trace).
- The reference simulator takes an optional -v argument that enables verbose output, displaying the hits, misses, and evictions that occur as a result of each memory access. You are not required to implement this feature in your cache simulator, but it is a good idea to do so, as it will give you helpful output to use in comparing against the reference simulator.

You will want to make use of the C standard library while working on your simulator. Here are some particular functions that you may wish to use (but feel free to expand beyond this list):
- Files can be opened using the fopen function and closed using the fclose function.
- A line of a file can be read using the fgets function.
- A string can be parsed using sscanf, which works like scanf but reads input from a given string rather than from stdin (i.e., typing at the terminal window).
- Command-line arguments can be parsed using the getopt function, which takes care of the details of flag parsing for you. For example, suppose a program supports command-line flags a, b, and c, where flag b takes a numeric argument. Such a program might be called as follows:

./myprog -b 25 -c

In this usage, the b flag is set and has a value of 25, the c flag is set (but has no value), and the a flag is not set. Here is idiomatic code using getopt that supports parsing these arguments:

int flag_a_set = 0;
int flag_b_value = -1;
int flag_c_set = 0;
int c; // getopt returns an int; a char may be unsigned and never equal -1
while ((c = getopt(argc, argv, "ab:c")) != -1) { // : indicates a flag with an argument
    switch (c) {
    case 'a':
        flag_a_set = 1;
        break;
    case 'b':
        // optarg is a global variable set by getopt
        flag_b_value = atoi(optarg); // atoi converts a string to an int
        break;
    case 'c':
        flag_c_set = 1;
        break;
    default:
        // we got an unexpected flag (not a, b, or c)
        printUsageMsg(argv); // print a message saying what args are expected
        exit(1);
    }
}

Note that you'll need to include a few header lines to use getopt:

#include <getopt.h>
#include <stdlib.h>
#include <unistd.h>
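The file-handling functions from the list above can be combined into a simple read loop. The sketch below is illustrative only: count_data_accesses is a hypothetical helper that counts the lines beginning with a space, and the 256-byte buffer assumes that every trace line is shorter than that.

```c
#include <stdio.h>

/* Counts the lines in a trace file that begin with a space (the "M", "L",
 * and "S" lines). Returns -1 if the file cannot be opened. */
long count_data_accesses(const char *path)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        return -1;
    }

    char line[256]; /* assumed to be long enough for any trace line */
    long count = 0;

    /* fgets returns NULL at end of file (or on a read error). */
    while (fgets(line, sizeof line, fp) != NULL) {
        if (line[0] == ' ') {
            count++;
        }
    }

    fclose(fp);
    return count;
}
```

In a simulator, the body of the loop would instead hand each line to whatever parsing and cache-update code you write.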
As usual, you can download the starter files by running svn update in your lab repository. You are responsible for completing the contents of cachesim.c, but should not modify any other file (e.g., cache.c or cache.h).
Your final submission will consist of your committed cachesim.c file at the time of the due date.
Your simulator will be graded on program correctness, design, and style. For full credit, your program must compile without any warnings on the class server.
You can (and should) consult the Coding Design & Style Guide for tips on design and style issues. Please ask if you have any questions about what constitutes good program design and/or style that are not covered by the guide.