CS 220 - Lab 5 Handout

CS220 - Design of a Complete CPU and Memory System
October 10, 2003

This document summarizes a basic design of a complete CPU and memory system that we can aim to implement in LogicWorks. The architecture shown on page 205, together with the ALU and Memory units you designed in earlier labs, provides the basis for this design.

All registers shown here are 8-bit (1-byte) registers. Registers PC and MBR contain the address and value of the instruction currently being executed, while registers MAR and MDR contain the address and value of an operand of the instruction currently being executed. Register H is an intermediate register that is useful for various arithmetic operations.

This design has four buses. The B and C buses connect the registers to the ALU. In each clock cycle, a register is connected to the ALU through the B bus, an operation is performed between it and the contents of the H register, and the result is connected to one or more of the registers through the C bus.

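The A and D buses connect the CPU to memory. The A bus contains the address of a memory byte to be read or written, and the D bus contains the value to be read or written. Whether a read or a write takes place in a given clock cycle is determined by the "rd" and "wr" operations described below.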

The memory is a 256-byte memory and is byte-addressable, so a memory address is 8 bits and the range of memory addresses (the "address space") is 0-255.

The memory is divided into two conceptual parts, the "code" and the "stack." The code is the IJVM bytecode program itself, and it occupies a block of memory locations beginning at address 0. The stack contains the local data for the program and occupies a block of memory locations beginning at address 128. The bottom value on the stack is at address 128, while the top value on the stack is at the address in register SP. A copy of that top value is always maintained in register TOS.
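The memory layout just described can be sketched in Java as follows. This is only an illustration, not part of the LogicWorks design: the class and method names are hypothetical, and the starting value of SP for an empty stack (one below address 128) is an assumption, since the handout only says where the bottom value lives.

```java
// Illustrative model of the 256-byte memory; all names here are hypothetical.
class MemoryModel {
    static final int STACK_BASE = 128;   // the bottom stack value lives at address 128

    final int[] mem = new int[256];      // one entry per byte, values kept in 0..255
    int sp = STACK_BASE - 1;             // assumption: an empty stack has SP just below the base
    int tos = 0;                         // copy of the top-of-stack value

    // Wrap a value to 8 bits, as an 8-bit register would.
    static int wrap8(int v) { return v & 0xFF; }

    // Push: advance SP, write the value at the new top, and refresh TOS.
    void push(int value) {
        sp = wrap8(sp + 1);
        mem[sp] = wrap8(value);
        tos = mem[sp];
    }

    // Pop: return the top value, move SP down, and refresh TOS from memory.
    int pop() {
        int value = mem[sp];
        sp = wrap8(sp - 1);
        tos = mem[sp];
        return value;
    }
}
```

With this model, the first push lands at address 128, and TOS always mirrors the byte at address SP.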

Instruction Set

A machine instruction is either 1 byte or 2 bytes long, depending on whether it addresses memory or has an immediate operand. If it is an add, subtract, and, or, pop, dup, swap, return, or nop instruction, it is 1 byte long and consists of the op code alone, as shown below:

   +----------+
   |    Op    |
   +----------+
      8 bits

Here, Op identifies the 8-bit operation code for one of these instructions. For example, the assembly language instruction

IADD

pops the top two bytes on the stack, adds them, and pushes their sum onto the stack in their place. All numeric data in this machine is stored as 8-bit two's complement integers. Here is a summary of the 1-byte instructions and their meanings.

                  Meaning (Clock cycles):
Symbol   Op(Hex)  T1            T2          T3               T4
------   -------  -----------   ---------   --------------   -------
IADD     60       MAR=SP=SP-1   H=TOS; rd   MDR=TOS=MDR+H    wr; End
ISUB     64       MAR=SP=SP-1   H=TOS; rd   MDR=TOS=MDR-H    wr; End
IAND     7E       MAR=SP=SP-1   H=TOS; rd   MDR=TOS=MDR&H    wr; End
IOR      80       MAR=SP=SP-1   H=TOS; rd   MDR=TOS=MDR|H    wr; End
POP      57       MAR=SP=SP-1   rd          TOS=MDR; End
IRETURN  AC       Pwroff

The notation x = y means "assign the value of y to x" (as in Java or C), the notation "x; y" means "do x and y in the same clock cycle," and the notation "x, y" means "do x and then y in two separate clock cycles." Also, "rd" means "do a memory read" and "wr" means "do a memory write." Finally, the notation "x=y=z" is like an ordinary chained assignment, evaluated from right to left: the value of z is assigned to y and then to x.
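To make the T3 column concrete, here is a minimal Java sketch of the four ALU operations on 8-bit two's complement values. The class and method names are hypothetical. Following the table, mdr stands for the operand read from the stack in memory (the next-to-top value) and h for the former top of stack, so ISUB computes next-to-top minus top.

```java
// Sketch of the four ALU operations on 8-bit values; names are illustrative only.
class Alu8 {
    // Interpret the low 8 bits of v as a two's complement value in -128..127.
    static int signed8(int v) { return (byte) v; }

    // Each result is wrapped to 8 bits, as it would be in an 8-bit register.
    static int add(int mdr, int h) { return (mdr + h) & 0xFF; }   // IADD
    static int sub(int mdr, int h) { return (mdr - h) & 0xFF; }   // ISUB: mdr minus h
    static int and(int mdr, int h) { return (mdr & h) & 0xFF; }   // IAND, bitwise
    static int or(int mdr, int h)  { return (mdr | h) & 0xFF; }   // IOR, bitwise
}
```

For instance, Alu8.sub(3, 5) yields the byte FE (hex), which signed8 reads as -2, and Alu8.add(127, 1) wraps around to -128 when read as a signed value.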

For example, execution of IADD takes place in 4 clock cycles: decrementing the SP; transferring the second operand into H from TOS while reading the first operand from memory; adding the two operands and storing the result into TOS; and writing this result onto the stack.
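The four cycles of IADD can be traced in software as a sanity check before wiring anything in LogicWorks. The sketch below is a simplification under stated assumptions: the class name is hypothetical, and a memory read is modeled as delivering its value to MDR immediately, whereas real hardware timing may differ.

```java
// Register-transfer trace of IADD; a simplified model with hypothetical names.
class IaddTrace {
    int sp, mar, mdr, h, tos;            // the 8-bit registers involved
    final int[] mem = new int[256];      // the 256-byte memory

    // Perform the four clock cycles of IADD in order.
    void iadd() {
        // T1: MAR = SP = SP - 1
        sp = (sp - 1) & 0xFF;
        mar = sp;
        // T2: H = TOS; rd   (assumption: the read fills MDR from mem[MAR] right away)
        h = tos;
        mdr = mem[mar];
        // T3: MDR = TOS = MDR + H
        tos = (mdr + h) & 0xFF;
        mdr = tos;
        // T4: wr; End   (write MDR back to mem[MAR])
        mem[mar] = mdr;
    }
}
```

Starting with 3 at address 128, 4 at address 129, SP = 129 and TOS = 4, a call to iadd() leaves SP = 128 and both TOS and the byte at address 128 equal to 7.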

If the instruction is a stack push, goto, if, increment, load, or store instruction, it requires two bytes in memory, one for the op code and the other for the memory address or value of the operand. These instructions have the following layout:

   +----------+------------------+
   |    Op    | Address or value |
   +----------+------------------+
      8 bits         8 bits

Here is a summary of the 2-byte instructions and their meanings.

                 Meaning (Clock cycles):
Symbol  Op(Hex)  T1              T2           T3           T4               T5           T6
------  -------  --------------  -----------  -----------  ---------------  -----------  ------------
ILOAD   15       PC=PC+1; fetch  MAR=MBR      rd           TOS=MDR          MAR=SP=SP+1  wr; End
BIPUSH  10       PC=PC+1; fetch  MDR=TOS=MBR  MAR=SP=SP+1  wr; End
ISTORE  36       PC=PC+1; fetch  MAR=MBR      MDR=TOS      MAR=SP=SP-1; wr  rd           TOS=MDR; End
GOTO    A7       PC=PC+1; fetch  PC=MBR
IFEQ    99       PC=PC+1; fetch  TOS=TOS      Z->PC=MBR-1  MAR=SP=SP-1      rd           TOS=MDR; End
IFLT    9B       PC=PC+1; fetch  TOS=TOS      N->PC=MBR-1  MAR=SP=SP-1      rd           TOS=MDR; End
IINC    84       PC=PC+1; fetch  MAR=MBR      rd           MDR=MDR+1        wr; End

At the beginning of execution for each of these 13 instruction types, a single clock cycle, T0, fetches the instruction and then branches to one of the above sequences. Its definition is:

T0
-------------------------
PC=PC+1; fetch; goto (OP)

T0 is also the clock cycle to which each of the above sequences returns when the End switch is set. After T0, the execution of one instruction takes from 1 to 6 clock cycles, depending on the instruction type.
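As a rough aid for estimating running time, the cycle counts in the two tables can be tabulated in Java. This is a sketch with hypothetical names; counting T0 as one extra clock cycle per instruction is an assumption about how the fetch cycle is charged.

```java
// Clock-cycle bookkeeping for the 13 instructions; names are illustrative only.
class CycleCount {
    // Number of execution cycles (T1 through the last listed step) for an opcode,
    // read directly off the two instruction tables.
    static int executeCycles(int opcode) {
        switch (opcode) {
            case 0x60: case 0x64: case 0x7E: case 0x80: return 4;  // IADD, ISUB, IAND, IOR
            case 0x57: return 3;                                    // POP
            case 0xAC: return 1;                                    // IRETURN
            case 0x15: case 0x36: return 6;                         // ILOAD, ISTORE
            case 0x10: return 4;                                    // BIPUSH
            case 0xA7: return 2;                                    // GOTO
            case 0x99: case 0x9B: return 6;                         // IFEQ, IFLT
            case 0x84: return 5;                                    // IINC
            default:
                throw new IllegalArgumentException("unknown opcode " + Integer.toHexString(opcode));
        }
    }

    // Assumption: the T0 fetch costs one additional cycle per instruction.
    static int withFetch(int opcode) { return 1 + executeCycles(opcode); }

    // Total cycles for a straight-line sequence of opcodes (operand bytes are not listed).
    static int sequenceCycles(int[] opcodes) {
        int total = 0;
        for (int op : opcodes) total += withFetch(op);
        return total;
    }
}
```

Under that assumption, the four-instruction sequence ILOAD, ILOAD, IADD, ISTORE costs (1+6) + (1+6) + (1+4) + (1+6) = 26 clock cycles.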

To illustrate the idea of programming at this level of language, here are some simple examples.

The Java statement i=i+j, where i and j are int variables, can be broken down into a series of IJVM instructions. Since an arithmetic operation cannot be performed unless both operands are on the stack, we need the following sequence:

     ILOAD i
     ILOAD j
     IADD
     ISTORE i

Similarly, the Java statement if (i>=j) k=k-1; can be coded in the following sequence of instructions:

     ILOAD i
     ILOAD j
     ISUB
     IFLT skip    ; if i<j we can skip the assignment k=k-1
     ILOAD k      ; otherwise, we do the assignment
     BIPUSH 1     ; k is pushed first, since ISUB subtracts the top value from the one below it
     ISUB
     ISTORE k
skip:

Here is a more interesting example, which illustrates the process of breaking a program down into a sequence of more elementary steps that can be handled by our new machine architecture. Consider the following Java code that computes the factorial f of an integer n.

f = 1;
for (int i=2; i<=n; i++)
  f = f * i;

The IJVM has no multiplication (IMUL) instruction, so we need to rewrite the multiplication as a loop of repeated additions:

f = 1;
for (int i=2; i<=n; i++) {
  int s = 0;
  for (int j=1; j<=i; j++)
     s = s + f;
  f = s;
}
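It is worth convincing ourselves that the rewritten loop still computes the factorial. The Java sketch below wraps the loop in a method and adds a multiplication-based version for comparison; the class and method names are hypothetical.

```java
// Check that repeated addition reproduces the factorial; names are illustrative only.
class FactorialByAddition {
    // Factorial using only addition, following the rewritten loop above.
    static int factorial(int n) {
        int f = 1;
        for (int i = 2; i <= n; i++) {
            int s = 0;
            for (int j = 1; j <= i; j++) {
                s = s + f;          // after i additions, s equals i * f
            }
            f = s;
        }
        return f;
    }

    // Reference version using multiplication, for comparison only.
    static int factorialDirect(int n) {
        int f = 1;
        for (int i = 2; i <= n; i++) {
            f = f * i;
        }
        return f;
    }
}
```

Both methods return 120 for n = 5. Keep in mind that on our machine all data are 8-bit two's complement integers, so 5! = 120 is the largest factorial that fits (6! = 720 exceeds 127).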

The IJVM has no for statements either, so we need to rewrite it again, this time without for loops. Also, we're dropping the declarator 'int' and the semicolon, since they're redundant at this level.

     f = 1
     i = 2 
loop1: if (i>n) goto done1
     s = 0
     j = 1
loop2: if (j>i) goto done2
     s = s + f
     j = j + 1
     goto loop2
done2: f = s
     i = i + 1
     goto loop1
done1:

At the end of this program, when control reaches the label 'done1', the variable f will have the desired result. When generating machine language, we know that each line of the source program will occupy several lines of code. A comment in the middle column identifies the source statement for each group of code statements. The memory address of each code statement is in the left-hand column, and the machine code is on the right. Local variables and their addresses are listed at the bottom of the program.

          Symbolic Machine                                Actual Machine Code
 Address  Instruction                Comment              Op Addr
 -------  ----------------------     -----------------    -------------------
  00              BIPUSH  1          f = 1                10 01
  02              ISTORE  f                               36 32
  04              BIPUSH  2          i = 2                10 02
  06              ISTORE  i                               36 34
  08      loop1:  ILOAD   n          if (i>n)             15 33
  0A              ILOAD   i             goto done1        15 34
  0C              ISUB                                    64
  0D              IFLT    done1                           9B 31
  0F              BIPUSH  0          s = 0                10 00
  11              ISTORE  s                               36 36
  13              BIPUSH  1          j = 1                10 01
  15              ISTORE  j                               36 35
  17      loop2:  ILOAD   i          if (j>i)             15 34
  19              ILOAD   j             goto done2        15 35
  1B              ISUB                                    64
  1C              IFLT    done2                           9B 29
  1E              ILOAD   s          s = s + f            15 36
  20              ILOAD   f                               15 32
  22              IADD                                    60
  23              ISTORE  s                               36 36
  25              IINC    j          j = j + 1            84 35
  27              GOTO    loop2                           A7 17
  29      done2:  ILOAD   s          f = s                15 36
  2B              ISTORE  f                               36 32
  2D              IINC    i          i = i + 1            84 34
  2F              GOTO    loop1                           A7 08
  31      done1:  IRETURN                                 AC
  32      f:                         local variables
  33      n:
  34      i:
  35      j:
  36      s:

Note that the hexadecimal addresses in the 'Address' column on the left are the memory locations of this program's instructions once they have been translated (by hand!) to machine code.
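A hand translation like this is easy to get wrong, so it helps to run the bytes through a small software interpreter before building the hardware. The Java sketch below works at the instruction level rather than clock cycle by clock cycle, and it rests on several assumptions that are not in the handout: the class and method names are hypothetical, SP starts at 127 (an empty stack), a branch operand is taken to be the absolute address of the target instruction, and the operand of IFLT done1 is written as 31, the address of done1, as that branch convention requires.

```java
// Instruction-level interpreter for the 13 instructions; a sketch with hypothetical names.
class TinyIjvm {
    // Machine code of the factorial program, addresses 00 through 31 (hex).
    static final String PROGRAM =
        "10 01 36 32 10 02 36 34 15 33 15 34 64 9B 31 10 00 36 36 10 01 36 35 "
      + "15 34 15 35 64 9B 29 15 36 15 32 60 36 36 84 35 A7 17 15 36 36 32 84 34 A7 08 AC";

    final int[] mem = new int[256];
    int pc = 0;
    int sp = 127;    // assumption: empty stack, so the first push lands at address 128
    int tos = 0;

    // Copy space-separated hex bytes into memory starting at address 0.
    void load(String hex) {
        String[] bytes = hex.trim().split("\\s+");
        for (int i = 0; i < bytes.length; i++) mem[i] = Integer.parseInt(bytes[i], 16);
    }

    private int fetch() { int b = mem[pc]; pc = (pc + 1) & 0xFF; return b; }
    private void push(int v) { sp = (sp + 1) & 0xFF; mem[sp] = v & 0xFF; tos = mem[sp]; }
    private void drop() { sp = (sp - 1) & 0xFF; tos = mem[sp]; }

    // IADD, ISUB, IAND, IOR: combine the next-to-top value (MDR) with the top (H).
    private void alu(int op) {
        int h = tos;
        sp = (sp - 1) & 0xFF;
        int mdr = mem[sp];
        int r;
        switch (op) {
            case 0x60: r = mdr + h; break;
            case 0x64: r = mdr - h; break;
            case 0x7E: r = mdr & h; break;
            default:   r = mdr | h; break;   // 0x80, IOR
        }
        tos = r & 0xFF;
        mem[sp] = tos;
    }

    // Execute until IRETURN; maxSteps guards against a runaway program.
    void run(int maxSteps) {
        for (int step = 0; step < maxSteps; step++) {
            int op = fetch();
            switch (op) {
                case 0x60: case 0x64: case 0x7E: case 0x80: alu(op); break;
                case 0x57: drop(); break;                         // POP
                case 0xAC: return;                                // IRETURN
                case 0x15: push(mem[fetch()]); break;             // ILOAD
                case 0x10: push(fetch()); break;                  // BIPUSH
                case 0x36: mem[fetch()] = tos; drop(); break;     // ISTORE
                case 0xA7: pc = fetch(); break;                   // GOTO
                case 0x99: { int t = fetch(); boolean z = tos == 0; drop(); if (z) pc = t; break; }
                case 0x9B: { int t = fetch(); boolean n = (tos & 0x80) != 0; drop(); if (n) pc = t; break; }
                case 0x84: { int a = fetch(); mem[a] = (mem[a] + 1) & 0xFF; break; }  // IINC
                default: throw new IllegalStateException("bad opcode at " + (pc - 1));
            }
        }
        throw new IllegalStateException("step limit reached");
    }

    // Load the program, store n at address 33 (hex), run, and read f from address 32 (hex).
    static int factorial(int n) {
        TinyIjvm m = new TinyIjvm();
        m.load(PROGRAM);
        m.mem[0x33] = n & 0xFF;
        m.run(100000);
        return m.mem[0x32];
    }
}
```

Under these assumptions, TinyIjvm.factorial(3) returns 6 and TinyIjvm.factorial(5) returns 120, which matches the Java loop we started from.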
