CS 220 - Lab 7: Memory Management - the Stack and the Cache
Due 5:00PM November 24, 2003
General Lab Goals
This lab explores the details of the runtime stack, especially its use
in implementing JVM method calls and parameter passing. We also explore
the use of cache memory to improve performance at the microarchitecture
level.
Part 1 - The Runtime Stack, Method Call, and Parameters
The runtime stack contains a "stack frame" for each active method, as
discussed on pages 218-227 of your text. This stack frame is created by
the calling method and is used to pass parameters and store local
variables.
The stack frame also contains space for the value being returned to the
calling method.
The calling method creates a stack frame in the following way.
First, it pushes the OBJREF word onto the stack, followed by the
parameter values for this particular call. Second, it issues an
INVOKEVIRTUAL instruction, which allocates space in the stack frame for
the local variables and saves the current values of PC and LV. This
instruction also adjusts the pointers LV and SP so that they point to
the bottom and top, respectively, of the newly created stack frame.
Figure 4-12 on page 224 provides a very good illustration of this
activity.
Suppose we want to write an IJVM assembly language program that
calls a method, say MAX, that determines the maximum of two integer
values and returns it. In Java, that program would look like this:
public class maxCalc {
    public static void main (String[] args) {
        int x, y;
        System.out.println("Enter two integers: ");
        x = Keyboard.readInt(); y = Keyboard.readInt();
        System.out.println("Their maximum is " + MAX(x, y));
    }
    public static int MAX (int m, int n) {
        int result;
        if (m > n)
            result = m;
        else
            result = n;
        return result;
    }
}
To implement the call in an IJVM main program, we need first to push
the OBJREF word and then the values of x and y onto the stack. Then we
need to call the method. The entire main method, including the call, is
sketched as follows:
.constant
OBJREF 0x40
.end-constant
.main
.var
x
y
.end-var
// code to read values for x and y from the user
// code to call MAX
LDC_W OBJREF
ILOAD x
ILOAD y
INVOKEVIRTUAL MAX
// code to display the result
.end-main
Note that the code for input and output is missing: we shall return to
this issue below.
Now the method MAX is written in a way that reflects the number of
parameters it has, and then refers to them in a way that is similar to
what is done at the Java level. That is, the parameters are listed in
parentheses after the method name. Local variables, if needed, are
declared in the same way as they are in the main method. When the
result is computed, it is left on top of the stack and an IRETURN
instruction is issued. The entire method MAX is given below:
.method MAX(m, n)
.var
result
.end-var
ILOAD n
ILOAD m
ISUB // check if m > n (that is, n-m < 0)
IFLT nless
mless: // or m <= n
ILOAD n
GOTO out
nless:
ILOAD m
out: IRETURN // leave result on top of the stack
.end-method
When IRETURN is executed, the effect is summarized on page 225. That
is, the entire stack frame for the called method, in this case MAX, is
removed from the stack (except for the return value), and the registers
SP, LV, and PC are restored to the values they had when the call was
initiated. Thus, the next instruction executed is the one that follows
the INVOKEVIRTUAL in the main (calling) method.
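
The call and return sequence described above can be modeled in a few lines of Java. This is only a sketch under simplifying assumptions, not the simulator's code: the stack is an int array, pc is a plain integer standing in for the program counter, and the names FrameSketch, invokeVirtual, iReturn, and callMax are hypothetical. It also assumes that the OBJREF slot is reused to hold a link to the saved PC, as in the figure cited above.

```java
// Sketch only: a simplified model of the Mic-1 stack, not the simulator's code.
final class FrameSketch {
    static final int[] stack = new int[32];
    static int sp = -1;  // index of the word on top of the stack
    static int lv = 0;   // index of the base of the current frame
    static int pc = 0;   // stand-in for the program counter

    static void push(int value) { stack[++sp] = value; }

    // Models INVOKEVIRTUAL. The caller has already pushed OBJREF and the
    // parameters; paramWords counts OBJREF plus the parameters.
    static void invokeVirtual(int paramWords, int localWords, int methodStart) {
        int newLv = sp - paramWords + 1; // the OBJREF slot becomes the frame base
        sp += localWords;                // reserve space for local variables
        push(pc);                        // save the caller's PC
        push(lv);                        // save the caller's LV
        stack[newLv] = sp - 1;           // OBJREF slot now links to the saved PC
        lv = newLv;
        pc = methodStart;
    }

    // Models IRETURN: the return value is on top of the stack.
    static void iReturn() {
        int result = stack[sp];
        int link = stack[lv];            // where the caller's PC was saved
        sp = lv;                         // discard the callee's frame
        pc = stack[link];                // restore the caller's PC
        lv = stack[link + 1];            // restore the caller's LV
        stack[sp] = result;              // return value replaces the OBJREF slot
    }

    // Hypothetical driver: main holds x = 3 and y = 9, then calls MAX(x, y).
    static int callMax() {
        sp = -1; lv = 0; pc = 100;       // reset; 100 is an arbitrary return address
        push(3); push(9);                // main's locals x and y
        push(0x40);                      // OBJREF
        push(stack[lv]);                 // parameter m = x
        push(stack[lv + 1]);             // parameter n = y
        invokeVirtual(3, 1, 200);        // 3 words pushed, 1 local (result)
        int m = stack[lv + 1], n = stack[lv + 2];
        push(m > n ? m : n);             // body of MAX leaves its result on top
        iReturn();
        return stack[sp];
    }

    public static void main(String[] args) {
        System.out.println("MAX returned " + callMax()
                + ", SP=" + sp + ", LV=" + lv + ", PC=" + pc);
    }
}
```

After callMax() finishes, SP points at the slot where OBJREF was pushed, which now holds the return value, and LV and PC are back at the caller's values.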
Part 2 - Input/Output Issues
Input in the IJVM is done by reading each individual character from the
keyboard and then converting it to the desired type, such as a binary
integer. Similarly, output in the IJVM is done by converting a binary
integer to a series of hexadecimal characters, and then displaying them
one by one on the screen. This makes for some nasty programming.
A good working example of IJVM input/output is given in the program add.jas,
which is in the Mic-1 -> ijvmasm folder. This program reads two
integers entered in hex at the keyboard, computes their sum, and
displays that sum in the mic1sim output window. Assemble and run this
program using the Mic-1 simulator to become familiar with its
functionality.
Now look at the IJVM source program add.jas. Notice that it has three
major parts:
- A main program
- A method getnum, which retrieves a series of keystrokes from the user
and converts it to a two's complement 32-bit integer
- A method print, which displays the hexadecimal value of its parameter
(again, a two's complement integer) as a series of characters.
Notice that the main program, although it's not as good a read as
"Harry Potter," has a relatively simple structure: three local
variables a, b, and total; two calls to getnum; and a call to print.
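
The character-by-character conversions that getnum and print are described as performing can be sketched in Java. This is an illustration, not a translation of add.jas: the names HexIo, parseHex, and toHex8 are hypothetical, and the sketch assumes uppercase hex digits on input and a fixed eight-digit output.

```java
// Sketch only: models the conversions, not the actual add.jas code.
final class HexIo {
    private static final String DIGITS = "0123456789ABCDEF";

    // Builds a 32-bit value one hex character at a time (compare getnum).
    static int parseHex(String keystrokes) {
        int value = 0;
        for (int i = 0; i < keystrokes.length(); i++) {
            char c = keystrokes.charAt(i);
            int digit = DIGITS.indexOf(c);
            if (digit < 0) {
                throw new IllegalArgumentException("not a hex digit: " + c);
            }
            value = (value << 4) | digit;   // shift in the next 4 bits
        }
        return value;
    }

    // Produces eight hex characters, most significant digit first (compare print).
    static String toHex8(int value) {
        StringBuilder out = new StringBuilder();
        for (int shift = 28; shift >= 0; shift -= 4) {
            out.append(DIGITS.charAt((value >>> shift) & 0xF));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        int total = parseHex("20") + parseHex("30");
        System.out.println(toHex8(total));
    }
}
```

Because the value is an ordinary 32-bit int, eight F digits on input produce -1, which matches the two's complement interpretation mentioned above.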
Part 3 - Recursive Methods and the Stack
Consider the factorial function, which can be defined in Java in either
of two ways:
public static int fact (int n) {
    // factorial function
    int result = 1;
    for (int i=2; i<=n; i++)
        result = result * i;
    return result;
}

public static int fact (int n) {
    // factorial function
    if (n<2)
        return 1;
    else
        return n*fact(n-1);
}
The first version is nonrecursive, so when it is called by a main
program, one and only one frame will be added to the stack for the life
of the call. The second version is recursive, so that there will be
several active calls during the computation of a single factorial. In
this case, each active call has its own stack frame, in which the
caller is identified as just another stack frame for fact.
Consider, for example, the recursive call fact(3), for which a stack
frame is created. This involves the computation of 3*fact(2), so a new
stack frame is created to compute fact(2). But now the computation of
fact(2)=2*fact(1) activates still another stack frame for the
calculation of fact(1). At this point, there are three active frames on
the stack, each of the first two waiting for another call to complete
before completing its own calculation and returning its result. Now the
calculation of fact(1) is not recursive, since it simply returns 1 to
its caller. Now that caller can complete the calculation of 2*fact(1)
and return that result to its caller, which in turn can complete the
calculation of 3*fact(2).
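
The growth and shrinkage of the stack during fact(3) can be observed with a small instrumented version of the recursive Java method. This is a sketch: the depth counter and the names FactTrace, depth, and maxDepth are hypothetical additions that stand in for the frames the machine would create.

```java
// Sketch only: counts active calls to mimic the frames on the runtime stack.
final class FactTrace {
    static int depth = 0;     // number of currently active calls to fact
    static int maxDepth = 0;  // largest number of simultaneously active calls

    static int fact(int n) {
        depth++;                               // a new frame is created
        maxDepth = Math.max(maxDepth, depth);
        System.out.println("enter  fact(" + n + "), active frames: " + depth);
        int result = (n < 2) ? 1 : n * fact(n - 1);
        System.out.println("return " + result + " from fact(" + n + ")");
        depth--;                               // the frame is removed
        return result;
    }

    public static void main(String[] args) {
        int value = fact(3);
        System.out.println("fact(3) = " + value + ", deepest nesting: " + maxDepth);
    }
}
```

For fact(3) the counter peaks at three active calls, one each for fact(3), fact(2), and fact(1), and then drops back as each call returns to its caller.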
The stack frame strategy built into Mic-1 (and most other
contemporary machines) thus supports recursion gracefully. Here
is an encoding of the fact function as a recursive method in
IJVM assembly language:
.method fact(n)
.var
.end-var
ILOAD n
BIPUSH 2
ISUB // check if n<2 (that is, n-2 < 0)
IFLT nless
LDC_W OBJREF
ILOAD n
BIPUSH 1
ISUB
INVOKEVIRTUAL fact // Compute fact(n-1).
ILOAD n // Result is now on top of stack,
IMUL // so compute n*fact(n-1)
IRETURN // and leave it on top of stack
nless:
BIPUSH 1
out: IRETURN // Leave 1 on top of stack
.end-method
Answer the following questions:
1. Identify the local variables and parameters for each of the two Java
versions of the fact method.
2. Write the nonrecursive version of the Java fact method in IJVM
assembly language.
3. In IJVM assembly language, write a call to either of these two IJVM
methods that computes the factorial of 5.
4. Why won't this method run using the Mic-1 simulator? What would need
to be added to Mic-1 so that it could run?
5. The greatest common divisor of two positive integers can be defined
using the following recursive definition:
gcd(m, n) = m            if m = n
          = gcd(m-n, n)  if m > n
          = gcd(m, n-m)  if m < n
Implement this gcd function as an IJVM recursive method.
6. Adapt the add.jas program that you exercised in Part 2 so that it
computes and displays the greatest common divisor of two integers,
entered in hexadecimal on the keyboard. For example, when your program
is run, the following should appear in the output window:
20
30
GCD = 00000010
That is, the GCD of 32 and 48 is 16.
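
As a Java-level starting point, in the same spirit as the Java versions of MAX and fact above, the recursive gcd definition in the questions can be written almost word for word. This is a sketch, not the required IJVM solution: the class name GcdSketch is hypothetical, and the method assumes both arguments are positive, since otherwise the subtraction never reaches the case m = n.

```java
// Sketch only: a direct transcription of the three-case definition.
final class GcdSketch {
    // Assumes m > 0 and n > 0.
    static int gcd(int m, int n) {
        if (m == n)
            return m;
        else if (m > n)
            return gcd(m - n, n);
        else
            return gcd(m, n - m);
    }

    public static void main(String[] args) {
        // 0x20 and 0x30 are the sample inputs shown above.
        System.out.printf("GCD = %08X%n", gcd(0x20, 0x30));
    }
}
```

Run with the sample inputs, this prints GCD = 00000010, which agrees with the expected output window contents.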
Part 4 - Cache Memory
Read the section on cache memory on pages 265-270 of your text. Also,
read the attached notes on cache memory, which give concrete examples
of cache addressing and calculating hit rates. Now answer the following
related questions.
1. A set-associative cache has 64 lines of 128 words each, divided into
4-line sets. The main memory contains 4K lines. How many bits are
needed for a memory address? How many bits are there in each of the
TAG, SET, and WORD fields of the address for this cache design?
2. A byte-addressable computer has a small cache which holds eight
32-bit words (lines). When a particular loop is executed one time, the
processor reads data from the following sequence of 12-bit addresses
(given in hexadecimal):
200 204 208 20C 2F4 2F0 200 204 218 21C 24C 2F4
This loop is repeated four (4) times.
a. Show the contents of the cache at the end of each pass through
this loop, assuming that the cache is direct-mapped. What is the hit
rate for this strategy?
b. Repeat this exercise, assuming now a 4-way set-associative
cache that uses the LRU replacement strategy. What is the hit
rate in this case?
3. Answer questions 26 and 28 on page 301 of your text.
Hand In:
Submit a hard copy of your answers to questions in Parts 3 and 4, and
drag an electronic copy of your program for the last question in Part 3
to the Drop Box in the CS220 (Tucker) folder.