CS220 - Design of a Complete CPU and Memory System
October 10, 2003

This document summarizes a basic design for a complete CPU and memory system that we can aim to implement in LogicWorks. The architecture shown on page 205, together with the ALU and Memory units you designed in earlier labs, provides the basis for this design.
All registers shown here are 8-bit (1-byte) registers.  Registers PC and MBR contain the address and value of the instruction currently being executed, while registers MAR and MDR contain the address and value of an operand of the instruction currently being executed. Register H is an intermediate register that is useful for various arithmetic operations.

This design has four buses.  The B and C buses connect the registers to the ALU.  In each clock cycle, a register is connected to the ALU through the B bus, an operation is performed between it and the contents of the H register, and the result is connected to one or more of the registers through the C bus. 

The A and D buses connect the CPU to memory.  The A bus contains the address of a memory byte to be read or written, and the D bus contains the value to be read or written.  Whether or nt

The memory is a 256-byte memory and is byte-addressable, so that a memory address is 8 bits and the range of memory addresses (the "address space") is 0-255.



The memory is divided into two conceptual parts, the "code" and the "stack." The code is the IJVM bytecode program itself, and it occupies a block of memory locations beginning at address 0.  The stack contains the local data for the program and occupies a block of memory locations beginning at address 128. The bottom value on the stack is at address 128, while the top value on the stack is at the address in register SP. A copy of that top value is always maintained in register TOS.
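
The stack discipline described above can be sketched in a few lines of Python. This is only an illustrative model, not part of the LogicWorks design: the names Machine, push, and pop are hypothetical, and the initial SP of 127 (one below the stack bottom, so that the first push lands at address 128) is an assumption the handout does not state.

```python
# Illustrative model of the 256-byte memory and the stack registers.
# Machine, push and pop are hypothetical names; the initial SP of 127
# is an assumption, chosen so the first push lands at address 128.

STACK_BASE = 128  # address of the bottom stack value


class Machine:
    def __init__(self):
        self.mem = [0] * 256      # byte-addressable memory, addresses 0-255
        self.sp = STACK_BASE - 1  # empty stack: SP sits just below the bottom
        self.tos = 0              # copy of the top-of-stack value

    def push(self, value):
        self.sp = (self.sp + 1) & 0xFF   # SP = SP + 1
        self.tos = value & 0xFF          # keep every value within 8 bits
        self.mem[self.sp] = self.tos     # write the new top to memory

    def pop(self):
        value = self.mem[self.sp]
        self.sp = (self.sp - 1) & 0xFF   # SP = SP - 1
        if self.sp >= STACK_BASE:        # refresh TOS from the new top
            self.tos = self.mem[self.sp]
        return value


m = Machine()
m.push(5)
m.push(7)
print(m.sp, m.tos)   # 129 7
m.pop()
print(m.sp, m.tos)   # 128 5
```

After the two pushes the values 5 and 7 sit at addresses 128 and 129, and TOS always mirrors the byte that SP points to.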


Instruction Set

A machine instruction is either 1 byte long or 2 bytes long, depending on whether or not it either addresses memory or has an immediate operand. If it is an add, subtract, and, or, pop, dup, swap, return, or nop instruction it is 1 byte long, as shown below:

Here, Op identifies the 8-bit operation code for one of these instructions. In assembly language, such an instruction is written as its symbol alone, for example

IADD

The interpretation of IADD is that the top two bytes on the stack are popped and added, and this sum is pushed on top of the stack in their place. All numeric data in this machine is stored as 8-bit 2's complement integers. Here is a summary of the 1-byte instructions and their meaning.

                  Meaning (Clock cycles):
Symbol   Op(Hex)  T1           T2         T3              T4
-------  -------  -----------  ---------  --------------  -------
IADD     60       MAR=SP=SP-1  H=TOS; rd  MDR=TOS=MDR+H   wr; End
ISUB     64       MAR=SP=SP-1  H=TOS; rd  MDR=TOS=MDR-H   wr; End
IAND     7E       MAR=SP=SP-1  H=TOS; rd  MDR=TOS=MDR&&H  wr; End
IOR      80       MAR=SP=SP-1  H=TOS; rd  MDR=TOS=MDR||H  wr; End
POP      57       MAR=SP=SP-1  rd         TOS=MDR; End
IRETURN  AC       Pwroff

The notation x = y means "assign the value of y to x" (as in Java or C), the notation "x; y" means "do x and y in the same clock cycle," and the notation "x, y" means "do x and then y in two separate clock cycles." Also, "rd" means "do a memory read" and "wr" means "do a memory write." Finally, the notation "x=y=z" is like an ordinary chained assignment, evaluated from right to left.

For example, execution of IADD takes place in 4 clock cycles: decrementing the SP; transferring the second operand into H from TOS and reading the first operand from memory; adding the two operands and storing the result into TOS; and writing this result onto the stack.
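
The four IADD cycles can be traced with a small Python sketch. It is a simplification rather than a model of the real timing: the memory read started in T2 is assumed to deliver its value into MDR within the same step, and the dictionary-based register file and the names iadd and to_signed are hypothetical.

```python
# Cycle-by-cycle sketch of IADD, following the T1-T4 steps for IADD.
# The register dictionary and the helper names are hypothetical.

def to_signed(byte):
    """Interpret an 8-bit value as a 2's complement integer."""
    return byte - 256 if byte & 0x80 else byte


def iadd(reg, mem):
    # T1: MAR = SP = SP - 1
    reg["SP"] = (reg["SP"] - 1) & 0xFF
    reg["MAR"] = reg["SP"]
    # T2: H = TOS; rd  (simplification: the read completes within T2)
    reg["H"] = reg["TOS"]
    reg["MDR"] = mem[reg["MAR"]]
    # T3: MDR = TOS = MDR + H, truncated to 8 bits
    reg["TOS"] = (reg["MDR"] + reg["H"]) & 0xFF
    reg["MDR"] = reg["TOS"]
    # T4: wr  (store MDR at address MAR); End
    mem[reg["MAR"]] = reg["MDR"]


mem = [0] * 256
mem[128], mem[129] = 100, 100   # the stack holds 100 and 100
reg = {"SP": 129, "TOS": 100, "H": 0, "MAR": 0, "MDR": 0}
iadd(reg, mem)
print(reg["SP"], reg["TOS"], to_signed(reg["TOS"]))   # 128 200 -56
```

The example also shows the effect of the 8-bit 2's complement representation: 100 + 100 produces the bit pattern 200, which reads back as -56.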

If the instruction is a stack push, goto, if, increment, load, or store instruction, it requires two bytes in memory, one for the op code and the other for the memory address or value of the operand. These instructions have the following layout:

Here is a summary of the 2-byte instructions and their meanings.

                  Meaning (Clock cycles):
Symbol   Op(Hex)  T1              T2           T3           T4               T5           T6
-------  -------  --------------  -----------  -----------  ---------------  -----------  ------------
ILOAD    15       PC=PC+1; fetch  MAR=MBR      rd           TOS=MDR          MAR=SP=SP+1  wr; End
BIPUSH   10       PC=PC+1; fetch  MDR=TOS=MBR  MAR=SP=SP+1  wr; End
ISTORE   36       PC=PC+1; fetch  MAR=MBR      MDR=TOS      MAR=SP=SP-1; wr  rd           TOS=MDR; End
GOTO     A7       PC=PC+1; fetch  PC=MBR
IFEQ     99       PC=PC+1; fetch  TOS=TOS      Z->PC=MBR-1  MAR=SP=SP-1      rd           TOS=MDR; End
IFLT     9B       PC=PC+1; fetch  TOS=TOS      N->PC=MBR-1  MAR=SP=SP-1      rd           TOS=MDR; End
IINC     84       PC=PC+1; fetch  MAR=MBR      rd           MDR=MDR+1        wr; End

At the beginning of execution for each of these 13 instruction types, a single clock cycle, T0, fetches the instruction and then branches to one of the above sequences.  Its definition is:

T0
-------------------------
PC=PC+1; fetch; goto (OP)

T0 is also the clock cycle to which each of the above sequences returns when the End switch is set.  The fetch-execute cycle for one instruction takes from 1 to 6 clock cycles, depending on the instruction type.
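
The "goto (OP)" dispatch in T0 amounts to a table lookup on the opcode. The Python sketch below copies the 13 opcodes from the two instruction summaries and uses them to walk a block of bytecode, stepping 1 or 2 bytes per instruction. The function name disassemble and the sample byte sequence are hypothetical and chosen only for illustration.

```python
# Sketch of a decoder: map each opcode to its symbol and length in bytes.
# The table is copied from the two instruction summaries; the function
# name disassemble and the sample bytes are hypothetical.

OPCODES = {
    0x60: ("IADD", 1), 0x64: ("ISUB", 1), 0x7E: ("IAND", 1),
    0x80: ("IOR", 1), 0x57: ("POP", 1), 0xAC: ("IRETURN", 1),
    0x15: ("ILOAD", 2), 0x10: ("BIPUSH", 2), 0x36: ("ISTORE", 2),
    0xA7: ("GOTO", 2), 0x99: ("IFEQ", 2), 0x9B: ("IFLT", 2),
    0x84: ("IINC", 2),
}


def disassemble(code):
    """Return a list of (address, text) pairs for a block of bytecode."""
    lines, pc = [], 0
    while pc < len(code):
        symbol, length = OPCODES[code[pc]]
        if length == 2:
            text = f"{symbol} {code[pc + 1]:02X}"   # opcode plus operand byte
        else:
            text = symbol
        lines.append((pc, text))
        pc += length
    return lines


for addr, text in disassemble(bytes([0x10, 0x01, 0x10, 0x02, 0x60, 0xAC])):
    print(f"{addr:02X}  {text}")
# 00  BIPUSH 01
# 02  BIPUSH 02
# 04  IADD
# 05  IRETURN
```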



To illustrate the idea of programming at this level of language, here are some simple examples.

The Java statement i=i+j, where i and j are int variables, can be broken down into a series of IJVM instructions. Since an arithmetic operation cannot be performed unless both operands are on the stack, we need the following sequence:

ILOAD i
ILOAD j
IADD
ISTORE i

Similarly, the Java statement if (i>=j) k=k-1; can be coded as the following sequence of instructions:

ILOAD i
ILOAD j
ISUB
IFLT skip ; if i<j we can skip the assignment k=k-1
ILOAD k ; otherwise, we do the assignment
BIPUSH 1
ISUB ; k-1: ISUB subtracts the top value from the one beneath it
ISTORE k
skip:
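
The conditional can be checked with an instruction-level Python sketch. ISUB computes the second-from-top value minus the top value (MDR-H in its T3 step), so k has to be pushed before the constant 1. The function run, the tuple encoding of instructions, and the LABEL pseudo-instruction are hypothetical conveniences, and 8-bit wraparound is ignored to keep the sketch short.

```python
# Instruction-level sketch of "if (i>=j) k=k-1;" (names are hypothetical).
# ISUB follows the T3 micro-step MDR - H: second-from-top minus top.

def run(program, variables):
    stack, pc = [], 0
    # Map each label name to its position in the program list.
    labels = {arg: n for n, (op, arg) in enumerate(program) if op == "LABEL"}
    while pc < len(program):
        op, arg = program[pc]
        pc += 1
        if op == "ILOAD":
            stack.append(variables[arg])
        elif op == "BIPUSH":
            stack.append(arg)
        elif op == "ISUB":
            top = stack.pop()
            stack.append(stack.pop() - top)
        elif op == "ISTORE":
            variables[arg] = stack.pop()
        elif op == "IFLT":
            if stack.pop() < 0:      # branch when the popped value is negative
                pc = labels[arg]
    return variables


PROGRAM = [
    ("ILOAD", "i"), ("ILOAD", "j"), ("ISUB", None),
    ("IFLT", "skip"),
    ("ILOAD", "k"), ("BIPUSH", 1), ("ISUB", None),
    ("ISTORE", "k"),
    ("LABEL", "skip"),
]

print(run(PROGRAM, {"i": 5, "j": 3, "k": 10})["k"])   # 9
print(run(PROGRAM, {"i": 2, "j": 3, "k": 10})["k"])   # 10
```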


Here is a more interesting example, which illustrates the process of breaking a program down into a sequence of more elementary steps that can be handled by our new machine architecture.  Consider the following Java code that computes the factorial f of an integer n.
    f = 1;
    for (int i=2; i<=n; i++)
        f = f * i;
The IJVM has no multiplication (IMUL) instruction, so we need to rewrite the multiplication as a loop of repeated additions:
    f = 1;
    for (int i=2; i<=n; i++) {
        int s = 0;
        for (int j=1; j<=i; j++)
            s = s + f;
        f = s;
    }
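
As a quick sanity check, the rewritten loop can be transcribed into Python and compared against the library factorial. The function name factorial_by_addition is hypothetical. Note that 5! = 120 is the largest factorial that fits in an 8-bit 2's complement value (maximum 127), so on this machine the program is only meaningful for n up to 5.

```python
import math


def factorial_by_addition(n):
    """Compute n! using only additions, mirroring the rewritten loop."""
    f = 1
    for i in range(2, n + 1):
        s = 0
        for j in range(1, i + 1):   # add f to s exactly i times: s = f * i
            s = s + f
        f = s
    return f


for n in range(1, 6):
    assert factorial_by_addition(n) == math.factorial(n)

print(factorial_by_addition(5))   # 120
```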
The IJVM has no for statements either, so we need to rewrite it again, this time without for loops. Also, we're dropping the declarator 'int' and the semicolon, since they're redundant at this level.
        f = 1
        i = 2
loop1:  if (i>n) goto done1
        s = 0
        j = 1
loop2:  if (j>i) goto done2
        s = s + f
        j = j + 1
        goto loop2
done2:  f = s
        i = i + 1
        goto loop1
done1:
At the end of this program, when control reaches the label 'done1', the variable f will have the desired result. When generating machine language, we know that each line of the source program will occupy several lines of code. A comment in the middle column identifies the source statement for each group of code statements. The memory address of each code statement is in the left-hand column, and the machine code is on the right. Local variables and their addresses are listed at the bottom of the program.
         Symbolic Machine                       Actual Machine Code
Address  Instruction         Comment            Op  Addr
-------  ------------------  -----------------  --  ----
00               BIPUSH 1    f = 1              10  01
02               ISTORE f                       36  32
04               BIPUSH 2    i = 2              10  02
06               ISTORE i                       36  34
08       loop1:  ILOAD n     if (i>n)           15  33
0A               ILOAD i       goto done1       15  34
0C               ISUB                           64
0D               IFLT done1                     9B  31
0F               BIPUSH 0    s = 0              10  00
11               ISTORE s                       36  36
13               BIPUSH 1    j = 1              10  01
15               ISTORE j                       36  35
17       loop2:  ILOAD i     if (j>i)           15  34
19               ILOAD j       goto done2       15  35
1B               ISUB                           64
1C               IFLT done2                     9B  29
1E               ILOAD s     s = s + f          15  36
20               ILOAD f                        15  32
22               IADD                           60
23               ISTORE s                       36  36
25               IINC j      j = j + 1          84  35
27               GOTO loop2                     A7  17
29       done2:  ILOAD s     f = s              15  36
2B               ISTORE f                       36  32
2D               IINC i      i = i + 1          84  34
2F               GOTO loop1                     A7  08
31       done1:  IRETURN                        AC
32       f:                  local variables
33       n:
34       i:
35       j:
36       s:

Note that the hex addresses in the 'Address' column on the left are those of this program's instructions once they have been translated (by hand!) to machine code.
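
To check a hand translation like this one, it helps to run the bytes. The Python sketch below is an instruction-level interpreter, not a cycle-accurate model of the design. Several things in it are assumptions rather than facts from the handout: SP starts at 127, branch operands are the absolute address of the target label (so IFLT done1 carries operand 31), IRETURN simply halts, and the names run and signed are hypothetical.

```python
# Instruction-level sketch that runs the hand-assembled factorial bytes.
# Assumptions (not from the handout): SP starts at 127, branch operands
# are absolute target addresses, and IRETURN simply halts.

CODE = bytes.fromhex(
    "1001 3632"          # f = 1
    "1002 3634"          # i = 2
    "1533 1534 64 9B31"  # loop1: if (i>n) goto done1
    "1000 3636"          # s = 0
    "1001 3635"          # j = 1
    "1534 1535 64 9B29"  # loop2: if (j>i) goto done2
    "1536 1532 60 3636"  # s = s + f
    "8435 A717"          # j = j + 1; goto loop2
    "1536 3632"          # done2: f = s
    "8434 A708"          # i = i + 1; goto loop1
    "AC"                 # done1: IRETURN
)


def signed(b):
    """Interpret an 8-bit value as a 2's complement integer."""
    return b - 256 if b & 0x80 else b


def run(code, n):
    mem = bytearray(256)
    mem[:len(code)] = code          # the code block starts at address 0
    mem[0x33] = n                   # local variable n lives at address 33
    pc, sp = 0, 127
    while True:
        op, arg = mem[pc], mem[pc + 1]   # arg matters only for 2-byte forms
        if op == 0xAC:              # IRETURN: halt and report f (address 32)
            return mem[0x32]
        if op == 0x10:              # BIPUSH value
            sp += 1
            mem[sp] = arg
            pc += 2
        elif op == 0x15:            # ILOAD address
            sp += 1
            mem[sp] = mem[arg]
            pc += 2
        elif op == 0x36:            # ISTORE address
            mem[arg] = mem[sp]
            sp -= 1
            pc += 2
        elif op == 0x84:            # IINC address
            mem[arg] = (mem[arg] + 1) & 0xFF
            pc += 2
        elif op == 0xA7:            # GOTO address
            pc = arg
        elif op == 0x9B:            # IFLT address: pop, branch if negative
            value = signed(mem[sp])
            sp -= 1
            pc = arg if value < 0 else pc + 2
        elif op in (0x60, 0x64):    # IADD / ISUB: second-from-top op top
            top = mem[sp]
            sp -= 1
            result = mem[sp] + top if op == 0x60 else mem[sp] - top
            mem[sp] = result & 0xFF
            pc += 1
        else:
            raise ValueError(f"unknown opcode {op:02X} at address {pc:02X}")


print(run(CODE, 5))   # 120
```

Under these assumptions the program leaves 5! = 120 in f when n is 5. A wrong branch operand typically shows up immediately as an "unknown opcode" error, because control lands on an operand byte instead of an instruction.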