This is part of the "microprogramming notes" web pages, which contain the
following subtopics:
- Simple one-bus CPU architecture
- Register in / out connections
- Microcode examples <-- you are here
- ALU architecture (insides)
- ALU interface (outsides)
- Control circuitry
- Simple two-bus CPU architecture
- Simple three-bus CPU architecture (used in RISCs)
Microcode examples
Here are a few microcode examples.
For the simplest control sequence example, let's do a transfer from register
R0 to R1, like the PDP-11 instruction "MOV R0, R1".
This is a single step:
0. R0out, R1in
Another control sequence example:
R2 <- R1+R2
This will illustrate the use of the
Y and Z registers.
The ALU does its operation on whatever's on the bus and something in
a special "other ALU operand" register, which is Y.
The ALU output goes into a special result register, namely Z.
This is how we use the ALU:
0. R1out, Yin
1. R2out, Add, Zin
2. Zout, R2in
Step 0 transfers (copies) the value in R1 into Y.
Step 1 puts the R2 value on the bus, and does an Add, thus adding R1+R2.
This result is loaded into Z.
Step 2 transfers (copies) the value in Z into R2.
Each numbered step listed above is a "microinstruction". Its meaning is that
every control signal we list for the step is turned on during that step, and
every control signal we do not list is turned off.
Thus the order in which we list the signals within a step is unimportant.
This is crucial, and perhaps subtle.
The notation resembles assembly language of sorts, but the sequence runs
from step to step, not from left to right.
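To make the "set of signals per step" reading concrete, here is a minimal Python sketch of the R2 <- R1+R2 sequence. It is not part of the original notes: the function names and the dictionary of registers are assumptions made for illustration, and only the handful of signals used in this example are modelled.

```python
# Hypothetical one-bus datapath sketch; all names here are illustrative.

def run_step(regs, signals):
    """Apply one microinstruction, given as an unordered collection of signals."""
    # At most one "...out" signal may drive the bus in a step.
    drivers = [s for s in signals if s.endswith("out")]
    assert len(drivers) <= 1, "bus conflict: more than one 'out' signal"
    bus = regs[drivers[0][:-3]] if drivers else 0

    # All loads see the values from the start of the step.
    new = dict(regs)
    # The ALU combines Y with the bus; Z keeps the result only if Zin is on.
    if "Add" in signals and "Zin" in signals:
        new["Z"] = regs["Y"] + bus
    # Every other "...in" signal loads its register from the bus.
    for s in signals:
        if s.endswith("in") and s != "Zin":
            new[s[:-2]] = bus
    return new


def run(regs, microprogram):
    for signals in microprogram:
        regs = run_step(regs, signals)
    return regs


# Written as sets to stress that order within a step carries no meaning.
ADD_R1_R2 = [
    {"R1out", "Yin"},
    {"R2out", "Add", "Zin"},
    {"Zout", "R2in"},
]

start = {"R0": 0, "R1": 5, "R2": 7, "Y": 0, "Z": 0}
result = run(start, ADD_R1_R2)
print(result["R2"])  # 12
```

Because each step is a set, there is no way to express "R1out before Yin" at all; only the order of the steps matters.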
Why have a control for Zin? It's not a connection to the bus; who cares if Z is just
getting set by the ALU all the time even if we don't want the result?
Well, since Z is a register, this additional control line means we don't
have to pick up the Z value right away. For example, we could add three
numbers, say P+Q+R, by leaving the sum of two of them in Z:
0. Pout, Yin
1. Qout, Add, Zin
2. Rout, Yin
3. Zout, Add, Zin
This only works if Z is a master-slave flip-flop register, so that it can be
loaded in the same cycle in which its previous value is being used.
We will assume that all of our registers are master-slave unless otherwise specified.
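The following Python sketch contrasts the two cases for the last step, "Zout, Add, Zin". It is only an illustration: the transparent-latch model, and in particular the `passes` parameter standing in for "how many times the feedback loop runs within one cycle", are assumptions and not a description of any real circuit.

```python
def step_master_slave(z, y):
    """Zout, Add, Zin with a master-slave Z: the bus sees the old Z for the
    whole cycle, and the new value appears only at the end of the cycle."""
    bus = z              # old Z drives the bus
    return y + bus       # loaded once


def step_transparent(z, y, passes=3):
    """Same signals with a hypothetical transparent latch for Z: the ALU
    output feeds straight back onto the bus while Zin is on."""
    for _ in range(passes):
        z = y + z
    return z


P, Q, R = 1, 2, 3
z_after_step_1 = P + Q                        # steps 0-1 leave P+Q in Z
print(step_master_slave(z_after_step_1, R))   # 6
print(step_transparent(z_after_step_1, R))    # 12, not P+Q+R
```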
Note that this logic is rife with considerations of clock timing being "long enough" for things to
propagate, the ALU to settle down, etc. This puts a cap on CPU clock speed. Each step is one
clock cycle, but we may have different parts of each step happen at slightly different times during
the clock cycle. It depends on the CPU design.
A full example
Example: the execution of the entire instruction "ADD something, R1".
ADD 1000, R1
means R1 <- M[1000]+R1
0. PCout, MARin, Read, Zero A, Set Carry-In, Add, Zin
1. Zout, PCin, Wait MFC
2. MDRout, IRin
3. AddressFieldOfIRout, MARin, Read
4. R1out, Yin, Wait MFC
5. MDRout, Add, Zin, Set CC
6. Zout, R1in, End
Detailed description:
- PCout, MARin, Read
- These initiate the fetch of the instruction. We put the desired
memory address into the MAR and assert the Read control line.
We read from the memory address held in the PC register because the PC
contains the address of the next instruction to execute; that's what the
PC register is for.
(The PC may or may not be R7; we're not designing control sequences
specifically for the PDP-11 here.)
- Zero A
- We'll want a control line which makes the ALU use 0 rather than Y for its
first operand, A.
Thus we can perform ALU operations, so long as they involve 0 as the first
operand, without an extra cycle to set up Y.
We normally design the control sequences and the hardware in tandem.
You can see how to implement a "Zero A" control line -- it is inverted, and
ANDed with all the lines coming out of Y.
Thus when it is 0, we AND with 1, which does nothing;
but when it is 1, we AND with 0, which sets all of them to zero.
But "Zero A" does not affect the value stored in Y, it just gives us a zero
instead of the value of Y for the ALU's first operand.
- Zero A, Set Carry-In, Add, Zin
- We have to increment the PC, so that it contains the address
of the next instruction for the next time we do this. With the ALU's first
operand forced to 0 and the carry-in set, the Add computes 0 + PC + 1.
This whole sequence will be in a big loop.
Note that we are simultaneously fetching the instruction and adding 1 to
the PC. We do as much as possible in a single step.
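The "Zero A" gating and the carry-in trick for step 0 can be sketched in Python as follows. The 16-bit word size and the function names are assumptions for illustration; the notes do not fix either.

```python
WIDTH = 16                     # assumed word size, not stated in the notes
MASK = (1 << WIDTH) - 1


def alu_a_input(y, zero_a):
    """AND every line coming out of Y with the inverted Zero A line."""
    gate = 0 if zero_a else MASK   # NOT(Zero A), fanned out to every bit
    return y & gate


def alu_add(y, bus, zero_a=False, carry_in=False):
    a = alu_a_input(y, zero_a)
    return (a + bus + (1 if carry_in else 0)) & MASK


y = 0x1234          # whatever happens to be in Y
pc = 0x0100
z = alu_add(y, pc, zero_a=True, carry_in=True)
print(hex(z))       # 0x101
print(hex(y))       # 0x1234
```

Y itself is never modified here; only the value presented to the adder changes.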
Now, for step 1:
- Zout, PCin
- This is a bus transfer from Z to PC. Z already contains the value PC+1.
We copy this into the PC register, thus completing the increment of the PC.
- Wait MFC
- Recall that a memory operation (the Read in step 0) may take longer
than one CPU cycle. We haven't needed the result yet, but we want to use
the result (i.e. the contents of the MDR) in step 2. So we have to say
that we should pause after step 1 until we have the MFC (memory function
complete) signal from the main memory unit.
We have a control signal to accomplish this, whose implementation we will
examine later in this document.
- Step 2. MDRout, IRin
- Now that we have the instruction, we transfer it to the IR, which is
where we put the instruction we're currently executing. The IR has extra
circuitry attached which decodes the instruction whenever we transfer a
value into the IR.
These three steps are common to the execution of any instruction.
They're the "instruction fetch" and "increment PC".
Note how we do as many things simultaneously as possible.
We did the memory read request in step 0, but while waiting for the result
to come back, we stored the incremented PC. Perhaps the result came back
before our step 1 completed. If so, no loss. But if not, we managed to sneak
in some extra processing for free.
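A rough way to see the saving is to count cycles under a toy timing model. Everything about the model below is an assumption for illustration: a Read asserted in cycle c is taken to raise MFC at the end of cycle c + latency, and the STALL_FIRST sequence is a hypothetical alternative that waits immediately rather than overlapping the PC update.

```python
OVERLAPPED = [
    {"PCout", "MARin", "Read", "Zero A", "Set Carry-In", "Add", "Zin"},
    {"Zout", "PCin", "Wait MFC"},
    {"MDRout", "IRin"},
]

# Hypothetical alternative that stalls in the same step that issues the Read.
STALL_FIRST = [
    {"PCout", "MARin", "Read", "Zero A", "Set Carry-In", "Add", "Zin", "Wait MFC"},
    {"Zout", "PCin"},
    {"MDRout", "IRin"},
]


def cycles_needed(microprogram, latency):
    cycle = 0
    mfc_cycle = 0
    for signals in microprogram:
        cycle += 1                         # each step takes at least one cycle
        if "Read" in signals:
            mfc_cycle = cycle + latency    # when memory will signal completion
        if "Wait MFC" in signals:
            cycle = max(cycle, mfc_cycle)  # hold this step until MFC
    return cycle


for latency in (1, 3):
    print(latency, cycles_needed(OVERLAPPED, latency),
          cycles_needed(STALL_FIRST, latency))
# 1 3 4
# 3 5 6
```

Under this model the overlapped sequence is one cycle shorter for either latency, which is the "extra processing for free".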
The above may give some additional insight into why, on the PDP-11 for
example, the various offsets are relative to the new PC after the
fetch. We haven't started specifically executing the ADD instruction yet,
we've just done the fetch, and already the PC is incremented.
Now, continuing on specifically with the ADD instruction:
- 3. AddressFieldOfIRout, MARin, Read
- The decoding circuitry extracts the Address field in the instruction,
which in this case is the value 1000. We are assuming a far simpler
instruction set than the PDP-11 addressing modes; the bits for that 1000 value
are right there in the one-word instruction, and it is an absolute address, not
a relative one.
Coming out of the decoding circuitry are some lines representing only the
address field of the IR's contents, and these are tri-stated onto the bus.
Thus they have an "out" line.
And this is the address from which we want to read to do the operand
fetch, so we transfer it into the MAR and assert the Read control line.
- 4. R1out, Yin, Wait MFC
- One of the operands is the contents of R1. For addition, it doesn't matter which
operand is which; so again we follow our strategy of trying to get other
stuff done while the Read operation might still be in progress.
One of the two operands for the ALU Add operation should be on the bus,
the other in the Y register.
We put the contents of R1 into the Y register, rather than the contents of
M[1000], because the contents of R1 are already
handy and it gives us something productive to do while perhaps still
waiting for the memory operation to complete. However, we do need the
result of step 3's memory operation for step 5, so we also include the
Wait MFC control line at this point.
- 5. MDRout, Add, Zin, Set CC
- With the MDR contents on the bus, we can do an Add and put the result
into Z. We can't put the result directly on the bus, because the contents
of MDR are
already there. So we don't even have a control line to put the ALU result
directly on the bus, because we could never use it -- the ALU's second
operand is already there.
The "Set CC" control line causes the condition codes to be affected by the
ALU operation.
We want this because the condition codes should be set according to what
happens with this addition, which is the one of interest from the
point of view of the machine-language programmer. The earlier addition,
in the standard fetch sequence, is not something which should change the
condition codes, so we didn't list "Set CC" on that line.
Unless we list "Set CC", it is off, and it controls the LOAD line for the
condition code register, so without Set CC, the condition codes stay as
they were despite whatever use we make of the ALU.
- 6. Zout, R1in, End
- Transfer the Z contents to R1, thus completing the operation.
This step also introduces the "End" control line, which indicates the end
of this instruction execution.
Thus the End control line means to
go to step 0, although perhaps for a
different instruction next time.
(We'll clarify this below.)
Can we save a step anywhere here? If we could, we would speed up every
single ADD instruction in the computer! But no: every step has
a register "out" control on, each step needs a different value on the bus
from its neighbours, and the one bus can carry only one value per step.
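The whole ADD sequence can be traced with a small Python sketch. This is an illustration under stated assumptions, not the notes' own machinery: instructions are stored as (opcode, address) tuples, memory responds within a single cycle (so Wait MFC never stalls), and only the N condition code is tracked. The assertion in `bus_value` checks that every step has exactly one bus driver.

```python
ADD_1000_R1 = [
    {"PCout", "MARin", "Read", "Zero A", "Set Carry-In", "Add", "Zin"},
    {"Zout", "PCin", "Wait MFC"},
    {"MDRout", "IRin"},
    {"AddressFieldOfIRout", "MARin", "Read"},
    {"R1out", "Yin", "Wait MFC"},
    {"MDRout", "Add", "Zin", "Set CC"},
    {"Zout", "R1in", "End"},
]


def bus_value(regs, signals):
    drivers = [s for s in signals if s.endswith("out")]
    assert len(drivers) == 1, f"need exactly one bus driver, got {drivers}"
    if drivers[0] == "AddressFieldOfIRout":
        return regs["IR"][1]          # decoder exposes only the address field
    return regs[drivers[0][:-3]]      # "PCout" -> PC, "MDRout" -> MDR, ...


def execute(regs, memory, microprogram):
    for signals in microprogram:
        bus = bus_value(regs, signals)
        new = dict(regs)              # all loads see start-of-step values
        if "Add" in signals and "Zin" in signals:
            a = 0 if "Zero A" in signals else regs["Y"]
            new["Z"] = a + bus + (1 if "Set Carry-In" in signals else 0)
            if "Set CC" in signals:   # CC register loads only when asked
                new["N"] = int(new["Z"] < 0)
        for s in signals:
            if s.endswith("in") and s != "Zin":
                new[s[:-2]] = bus
        if "Read" in signals:
            # MAR is being loaded from the bus in this same step, so the
            # bus value is the address; memory is assumed to answer at once.
            new["MDR"] = memory[bus]
        regs = new
    return regs


memory = {200: ("ADD", 1000), 1000: 40}
start = {"PC": 200, "R1": 2, "Y": 0, "Z": 0,
         "MAR": 0, "MDR": 0, "IR": None, "N": 1}
final = execute(start, memory, ADD_1000_R1)
print(final["R1"], final["PC"], final["N"])  # 42 201 0
```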
Another example:
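The PDP-11
BLT X instruction, where X is an offset. For simplicity we treat it as
"branch if condition code bit N is 1". (On the real PDP-11, BLT tests
N xor V; the instruction which tests N alone is BMI.)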
Unlike with VELMA, the X value is added to the current contents of the
PC register.
0, 1, 2. standard fetch sequence
3. PCout, Yin, If N=0 then End
4. AddressFieldOfIRout, Add, Zin
5. Zout, PCin, End
The 'If' there is done in hardware, with control lines.
The method is described in the "control circuitry" section later in these web
pages.
The Address Field of the IR in this case is the value 'X', not actually an
address.
In step 3, the essential action is the "If N=0 then End" test.
However, we also try to get some other work done in the event that we are
going to have to do the branch. If N is 0, we're done, it doesn't matter what
else we do so long as we don't break anything.
So we're putting something into Y, but if N is 0 this is not going to have any
lasting effect.
The rationale for transferring the contents of the PC register into Y only
becomes apparent in the case where N is 1, and we don't abort the
microroutine there. In this case, we need to add that value X to the PC
value, since X is an offset.
That is, the branch target is the contents of the PC plus X,
where X is a certain range of bits of the machine-language instruction stored
in the IR.
For simplicity, let's assume that the branch instructions' offset is in the
same range of bits as the data operations instructions' memory address field.
So AddressFieldOfIRout gives us our X value on the bus, which we
can add to the PC value to get the appropriate branch target.
In step 5, we put this calculated value into the PC register, thus
accomplishing the branch, and we assert the End control line. End resets the
µPC, restarting the standard fetch sequence, which will fetch and
execute the instruction at the address which is the new value in the PC
register.
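Steps 3 to 5 of the branch can be sketched in Python as below. This is an illustration only: the function name and the returned step count are assumptions, and the conditional End is taken to fire when N is 0, following the description above.

```python
def branch_steps(pc, x, n):
    """Steps 3-5 of the branch microroutine, after the standard fetch.

    pc is the already-incremented PC, x is the offset field of the IR,
    and n is the N condition code bit. Returns (new_pc, steps_used).
    """
    # Step 3: PCout, Yin, and the conditional End when N is 0.
    y = pc                      # harmless if we end here
    if n == 0:
        return pc, 1            # End: back to the fetch, PC untouched
    # Step 4: AddressFieldOfIRout, Add, Zin
    z = y + x
    # Step 5: Zout, PCin, End
    return z, 3


print(branch_steps(pc=201, x=-5, n=1))   # (196, 3)
print(branch_steps(pc=201, x=-5, n=0))   # (201, 1)
```

A branch that is not taken costs one step after the fetch; a taken branch costs three.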