| Assigned: | Saturday, November 10. | 
| Due Date: | Tuesday, November 20. | 
| Collaboration Policy: | Level 1 (refer to the official policy for details) | 
| Group Policy: | Pair-optional (you may work in a group of 2 if you wish) | 
This lab will help you understand the different types of cache designs and the impact that cache memories can have on the performance of your programs. To do so, you will write a C program that simulates the behavior of a cache memory on real-world memory usage traces.
Your cache simulator will simulate an arbitrary cache memory, as defined by the usual three values discussed in class: S (the number of sets), E (the number of lines per set), and B (the size of a data block). Specifically, your program will simulate the behavior of the specified cache on a trace file, which consists of a series of memory accesses that you will replay in simulation. The output of your simulator will be three values: the number of cache hits, the number of cache misses, and the number of block evictions performed.
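As an illustration of what "an arbitrary cache memory" means for your data structures, here is a minimal sketch of one possible representation. The type and function names (cache_line_t, cache_t, cache_create, cache_destroy) are hypothetical and not part of the starter code, and the layout (one flat array of S * E line records) is just one reasonable choice. No block data is stored, since the simulator only needs to count hits, misses, and evictions.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record for one cache line (not part of the starter code). */
typedef struct {
    int valid;      /* nonzero once the line holds a block */
    uint64_t tag;   /* tag bits of the block held in this line */
} cache_line_t;

/* Hypothetical cache: S sets of E lines stored in one flat array. */
typedef struct {
    uint64_t num_sets;       /* S */
    uint64_t lines_per_set;  /* E */
    cache_line_t *lines;     /* S * E records; set i starts at lines[i * E] */
} cache_t;

/* Allocate an empty cache with S sets of E lines; returns NULL on failure. */
cache_t *cache_create(uint64_t S, uint64_t E)
{
    cache_t *cache = malloc(sizeof *cache);
    if (cache == NULL) {
        return NULL;
    }
    cache->num_sets = S;
    cache->lines_per_set = E;
    /* calloc zero-fills the array, so every line starts out invalid */
    cache->lines = calloc(S * E, sizeof *cache->lines);
    if (cache->lines == NULL) {
        free(cache);
        return NULL;
    }
    return cache;
}

/* Release everything allocated by cache_create. */
void cache_destroy(cache_t *cache)
{
    if (cache != NULL) {
        free(cache->lines);
        free(cache);
    }
}
```

A real simulator will likely need additional per-line bookkeeping to decide which line to evict; that is left out here.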
To help you test your program, you have been provided with a reference implementation, as well as a driver program that will automatically test your simulator by comparing its results against the reference simulator. Details on the trace files and the reference simulator are provided below.
The traces subdirectory of your lab directory contains a collection of reference trace files that will be used to evaluate the correctness of your cache simulator. These trace files describe a series of memory accesses and are generated by valgrind.  The traces have the following format:
I 0400d7d4,8
 M 0421c7f0,4
 L 04f6b868,8
 S 7ff0005c8,8
Each line denotes one or two memory accesses. The general format of each line is
[space]operation address,size
The operation field denotes the type of memory access: "I" denotes an instruction load, "L" a data load, "S" a data store, and "M" a data modify (i.e., a data load followed by a data store). There is never a space before an "I", while there is always a space before an "M", "L", or "S". The address field specifies a 64-bit hex memory address. The size field specifies the number of bytes accessed by the operation.
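The line format above maps fairly directly onto a single sscanf call. The following is a sketch only: trace_entry_t and parse_trace_line are hypothetical names, and you are free to parse the trace differently.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical record for one parsed trace line. */
typedef struct {
    char op;        /* 'I', 'L', 'S', or 'M' */
    uint64_t addr;  /* memory address, parsed from hex */
    int size;       /* number of bytes accessed */
} trace_entry_t;

/* Parse one line such as " L 04f6b868,8".
 * Returns 1 on success and 0 if the line does not match the format. */
int parse_trace_line(const char *line, trace_entry_t *entry)
{
    unsigned long long addr;

    /* The leading space in the format skips any whitespace, so the same
     * format handles "I" lines (no leading space) and " M", " L", " S"
     * lines (one leading space). */
    if (sscanf(line, " %c %llx,%d", &entry->op, &addr, &entry->size) != 3) {
        return 0;
    }
    entry->addr = addr;
    return 1;
}
```

Because the format skips leading whitespace, this sketch does not tell you whether a space was present; the caller can simply inspect entry->op to distinguish instruction loads from data accesses.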
In addition to the included traces, you can use valgrind to generate your own memory traces in this format, like so:
$ valgrind --log-fd=1 --tool=lackey -v --trace-mem=yes ls -l
The above example will run the program ls -l and dump a trace of its memory accesses to stdout (i.e., the terminal). To save the output in a file, just redirect to a file by appending something like > ls.trace to the end of the valgrind command as we've seen in the past.
You have been provided with the compiled executable of a reference cache simulator called cachesim-ref to help you in testing your program. Just like the simulator you will write, the reference simulator outputs the total number of hits, misses, and evictions when running a given valgrind trace through the specified cache.
The reference simulator takes the following command-line arguments:
-h: Optional help flag that prints usage info
-v: Optional verbose flag that prints trace info
-s <s>: Number of set index bits (the number of bits, not the actual number of sets!)
-E <E>: Associativity (number of lines per set)
-b <b>: Number of block bits (the number of bits, not the actual block size!)
-t <tracefile>: Name of the valgrind trace to replay
Here is an example run of the reference simulator:
$ ./cachesim-ref -s 4 -E 1 -b 4 -t traces/t2.trace
hits:4 misses:5 evictions:3
Running the same with the addition of the -v flag will print information about each memory access in the trace:
$ ./cachesim-ref -v -s 4 -E 1 -b 4 -t traces/t2.trace
L 10,1 miss
M 20,1 miss hit
L 22,1 hit
S 18,1 hit
L 110,1 miss eviction
L 210,1 miss eviction
M 12,1 miss eviction hit
hits:4 misses:5 evictions:3
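To see why the accesses above behave as they do, it helps to split each address into its parts. The sketch below assumes the usual layout (the low b bits are the block offset, the next s bits are the set index, and the remaining high bits are the tag); set_index and tag_bits are hypothetical helper names, not functions provided with the lab.

```c
#include <stdint.h>

/* Hypothetical helper: extract the set index of an address, given s set
 * index bits and b block bits. Assumes s + b < 64. */
uint64_t set_index(uint64_t addr, int s, int b)
{
    uint64_t mask = ((uint64_t)1 << s) - 1;  /* low s bits set */
    return (addr >> b) & mask;
}

/* Hypothetical helper: extract the tag, i.e., everything above the set
 * index and block offset bits. Assumes s + b < 64. */
uint64_t tag_bits(uint64_t addr, int s, int b)
{
    return addr >> (s + b);
}
```

With s = 4 and b = 4, addresses 0x10, 0x110, and 0x210 all map to set 1 but have tags 0, 1, and 2. Since E = 1 in this run, each of them displaces the block that was previously in that set, which matches the "miss eviction" results shown for L 110,1 and L 210,1.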
Your job is to complete your own cache simulator such that it takes the same command-line arguments as the reference simulator and produces identical output. You should write your simulator in cachesim.c.  Note that this file is almost completely empty -- you'll need to write your simulator from scratch.
In addition to following the reference implementation output format, you must adhere to the following specifications while designing your cache simulator. Follow each of these instructions carefully, as each one of them has the potential to completely change your cache's behavior if ignored.
[...] s, E, and b. This means that you will need to allocate storage for your data structures using malloc.
valgrind does not put a space in front of "I" as it does for "M", "L", and "S", which may be helpful in parsing the trace.
[...] main function, you must call the existing function printSummary to output your results. See cache.h for details.
[...] valgrind traces.
Your lab files contained in lab5 consist of the following:
cachesim.c: Your cache simulator program.
cachesim-ref: The compiled reference simulator.
cache.c and cache.h: Helper code for the printSummary function.
Makefile: Included Makefile for building your program.
traces/: Directory of memory trace files for testing.
test-cachesim: Program to automatically test your simulator against the reference simulator.
To test your cache simulator against all of the included trace files and output an autograded correctness score, just execute the test-cachesim program:
$ ./test-cachesim 
                        Your simulator     Reference simulator
Points (s,E,b)    Hits  Misses  Evicts    Hits  Misses  Evicts
     3 (1,1,1)       9       8       6       9       8       6  traces/t1.trace
     3 (4,2,4)       4       5       2       4       5       2  traces/t2.trace
     3 (2,1,4)       2       3       1       2       3       1  traces/t3.trace
     3 (2,1,3)     167      71      67     167      71      67  traces/t4.trace
     3 (2,2,3)     201      37      29     201      37      29  traces/t4.trace
     3 (2,4,3)     212      26      10     212      26      10  traces/t4.trace
     3 (5,1,5)     231       7       0     231       7       0  traces/t4.trace
     6 (5,1,5)  265189   21775   21743  265189   21775   21743  traces/t5.trace
    27
Simulator summary: scored 27 of 27 points
Note that your simulator may be tested on traces not in the set of reference traces, and thus a full score on test-cachesim does not necessarily mean that your program will receive full correctness marks.
Here are some general tips for working on the cache simulator:
[...] struct for that purpose.
[...] unsigned long long -- this is because a regular long is often just a 32-bit value (implementation dependent). You may wish to use a typedef to avoid repeatedly typing this type. Alternatively, you can use the uint64_t type from <stdint.h> (which is guaranteed to be an unsigned, 64-bit value).
[...] t2.trace and t3.trace).
[...] -v argument that enables verbose output, displaying the hits, misses, and evictions that occur as a result of each memory access. You are not required to implement this feature in your cache simulator, but it is a good idea to do so, as it will give you helpful output to use in comparing against the reference simulator.
You will want to make use of the C standard library while working on your simulator. Here are some particular functions that you may wish to use (but feel free to expand beyond this list):
[...] fopen function and closed using the fclose function.
[...] fgets function.
[...] sscanf, which works like scanf but reads input from a given string rather than from stdin (i.e., typing at the terminal window).
[...] getopt function, which will take care of all these details for you. For example, suppose a program supports command-line flags a, b, and c, where flag b takes a numeric argument. Such a program might be called as follows:
./myprog -b 25 -c
In this usage, the b flag is set and has a value of 25, the c flag is set (but has no value), and the a flag is not set. Here is idiomatic code using getopt that supports parsing these arguments:
int flag_a_set = 0;
int flag_b_value = -1;
int flag_c_set = 0;
int c; // getopt returns an int; a char breaks the -1 check where char is unsigned
while ((c = getopt(argc, argv, "ab:c")) != -1) { // : indicates a flag with an argument
    switch (c) {
    case 'a':
        flag_a_set = 1;
        break;
    case 'b':
        // optarg is a global variable set by getopt
        flag_b_value = atoi(optarg); // atoi converts a string to an int
        break;
    case 'c':
        flag_c_set = 1;
        break;
    default:
        // we got an unexpected flag (not a, b, or c)
        printUsageMsg(argv); // print a message saying what args are expected
        exit(1);
    }
}
Note that you'll need to include a few header lines to use getopt:
#include <getopt.h>
#include <stdlib.h>
#include <unistd.h>
As usual, you can download the starter files by running svn update in your lab repository. You are responsible for completing the contents of cachesim.c, but should not modify any other file (e.g., cache.c or cache.h).
Your final submission will consist of your committed cachesim.c file at the time of the due date.
Your simulator will be graded on program correctness, design, and style. For full credit, your program must compile without any warnings on the class server.
You can (and should) consult the Coding Design & Style Guide for tips on design and style issues. Please ask if you have any questions about what constitutes good program design and/or style that are not covered by the guide.