Sunday, 6 October 2013

Microprogramming

We usually think of control signals as implemented by discrete logic, even if the implementation makes use of PALs or PLAs. Microprogramming, on the other hand, is an approach for implementing processor control in which the output signals are stored within a ROM.

The two main variations of microprogramming are the horizontal and vertical methods. In the previous section, we already saw some distinction between horizontal and vertical next-state organizations. In horizontal microprogramming, there is one ROM output for each control point in the data-path. Vertical microprogramming is based on the observation that only a subset of these signals is ever asserted in a given state. Thus, the control outputs can be stored in the ROM in an encoded form, effectively reducing the width of the ROM word at the expense of some external decoding logic.

Encoding the control signals may limit the data-path operations that can take place in parallel. If this is the case, we may need multiple ROM words to encode the same data-path operations that could be performed in a single horizontal ROM word.

For example, consider a microprogrammed control for a machine with four general-purpose accumulators. Most computer instruction formats limit the destination of an operation to a single register. Realizing this, you may choose to encode the destination of a register transfer operation in 2 bits rather than 4. The destination register select line is driven by logic that decodes these 2 bits from the control. Thus, at any given time, only one of the registers is selected as a destination.

The art of engineering a microprogrammed control unit is to strike the correct balance between the parallelism of the horizontal approach and the ROM economy of a vertical encoding. For example, the encoded register enable lines eliminate the possibility of any state loading two registers at the same time, even if this was supported by the processor data-path. If a machine instruction must load two registers, it will require multiple control states (and ROM words) to implement its execution.
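To make the trade-off concrete, here is a minimal Python sketch of the external decoding step, assuming four accumulators and a 2-bit destination field. The function name decode_destination and the ordering of the enable lines are illustrative assumptions, not part of the original design.

```python
def decode_destination(field):
    """Decode a 2-bit destination field into four one-hot load-enable lines."""
    if not 0 <= field <= 3:
        raise ValueError("destination field must fit in 2 bits")
    return [1 if index == field else 0 for index in range(4)]

# Horizontal: 4 ROM bits, so two registers could be loaded in one state.
# Encoded: 2 ROM bits, so exactly one register is selected per ROM word.
print(decode_destination(2))  # [0, 0, 1, 0]
```

Because the decoder output is one-hot, an instruction that loads two registers needs two ROM words under this encoding.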

We begin our study with the horizontal approach to microprogramming. We will see that the instruction set and the data-path typically do not support the full parallelism implied by horizontal control, so we will examine methods of encoding the ROM word to reduce its size.


12.5.1 Horizontal Microprogramming

The horizontal next-state organization of Figure 12.22 offers the core of a horizontal microprogrammed controller. An extremely horizontal control word format would have 1 bit for each microoperation in the data-path. Let's develop such a format for the simple CPU's control.

Our sample processor supports 14 register transfer operations. We further decompose these into 22 discrete microoperations (ordered by destination):
PC --> ABUS
IR --> ABUS
MBR --> ABUS
RBUS --> AC
AC --> ALU A
MBUS --> ALU B
ALU ADD
ALU PASS B
MAR --> Address Bus
MBR --> Data Bus
ABUS --> IR
ABUS --> MAR
Data Bus --> MBR
RBUS --> MBR
MBR --> MBUS
0 --> PC
PC + 1 --> PC
ABUS --> PC
Read/Write
Request
AC --> RBUS
ALU Result --> RBUS
A very long ROM word for a four-way branch sequencer would have a and b multiplexer bits, four 4-bit next states, and 22 microoperation bits. This yields a total ROM word length of 40 bits, as shown in Figure 12.23.
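The word length can be checked with a short Python sketch. The field widths come from the text; the field names, their order, and the assumption of one select bit per multiplexer are illustrative only, since Figure 12.23 is not reproduced here.

```python
# Field widths follow the text; names and ordering are assumptions.
FIELDS = [
    ("a_mux_select", 1),
    ("b_mux_select", 1),
    ("next_state_A0", 4),
    ("next_state_A1", 4),
    ("next_state_A2", 4),
    ("next_state_A3", 4),
    ("microoperations", 22),
]

word_length = sum(width for _, width in FIELDS)
print(word_length)  # 40
```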

Figure 12.24 gives the ROM contents for the Moore controller of Figure 12.1 (the branch sequencer of Figure 12.22 implements a Moore machine). The a multiplexer inputs are Sel0 = Wait, Sel1 = IR<15> and the b inputs are Sel0 = AC<15>, Sel1 = IR<14>. We assume the next-state register is reset directly.

The multiplexers on the next state work just as in Figure 12.22. For example, consider state IF1. We stay in this state if Wait is unasserted. If Wait is asserted, we advance to state IF2. The a and b mux controls are set to examine Wait and AC<15>, respectively. Thus, the 0X next-state bits (A0, A1) are set to the encoding for IF1, 0010. Similarly, the 1X next states (A2, A3) are set to the encoding for IF2, 0011.
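The next-state selection for state IF1 can be sketched in Python as follows. The dictionary layout of the ROM word and the function name next_state are hypothetical; the multiplexer inputs and the IF1/IF2 encodings are taken from the text, with the a multiplexer output treated as the high-order index bit.

```python
IF1, IF2 = 0b0010, 0b0011  # state encodings given in the text

def next_state(word, wait, ir15, ac15, ir14):
    """Select one of four next states from a horizontal ROM word."""
    a = (wait, ir15)[word["a_sel"]]    # a mux: Sel0 = Wait, Sel1 = IR<15>
    b = (ac15, ir14)[word["b_sel"]]    # b mux: Sel0 = AC<15>, Sel1 = IR<14>
    return word["next"][(a << 1) | b]  # index 0..3 picks A0..A3

# IF1: examine Wait and AC<15>; 0X entries loop in IF1, 1X entries go to IF2.
if1_word = {"a_sel": 0, "b_sel": 0, "next": [IF1, IF1, IF2, IF2]}

for wait in (0, 1):
    for ac15 in (0, 1):
        state = next_state(if1_word, wait, 0, ac15, 0)
        print(wait, ac15, format(state, "04b"))
```

Since A0 = A1 and A2 = A3, the value of AC<15> has no effect in this state; only Wait decides the transition.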

Reducing ROM Word Width Through Encoding

The horizontal approach offers the most flexibility by providing access to all of the data-path control points at the same time. The disadvantage is the width of the ROM word, which can exceed a few hundred bits in complex controllers.

A good way to reduce the ROM size is by encoding its output. This need not lead to an inherent loss of parallelism. After all, certain control combinations may not make sense logically (for example, 0 --> PC and PC + 1 --> PC are logically exclusive) or might be ruled out by the data-path busing strategy (for example, PC --> ABUS and IR --> ABUS cannot take place simultaneously).

Furthermore, the ROM contents of Figure 12.24 are very sparse. In any given state, very few of the control signals are ever asserted. This means we can group the control signals into mutually exclusive sets to encode them. We decode them outside the ROM with additional hardware.

For example, the three PC microoperations, 0 --> PC, PC + 1 --> PC, and ABUS --> PC, are never asserted in the same state. For the cost of an external 2-to-4 decoder, a ROM bit can be saved by encoding the signals as follows:

00    No PC control
01    0 --> PC
10    PC + 1 --> PC
11    ABUS --> PC
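The encoding table above can be modeled in Python as a small decoder. This is only a sketch: the function name decode_pc_field and the dictionary representation of the control lines are assumptions made for illustration.

```python
PC_SIGNALS = ("0 --> PC", "PC + 1 --> PC", "ABUS --> PC")

def decode_pc_field(field):
    """Model the 2-to-4 decoder: map a 2-bit field onto three PC control lines."""
    if not 0 <= field <= 3:
        raise ValueError("PC field must fit in 2 bits")
    # Decoder output 0 means "no PC control"; outputs 1..3 drive the signals.
    return {name: int(field == code)
            for code, name in enumerate(PC_SIGNALS, start=1)}

print(decode_pc_field(0b10))
# {'0 --> PC': 0, 'PC + 1 --> PC': 1, 'ABUS --> PC': 0}
```

Two ROM bits replace three, and at most one PC signal can be asserted at a time.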

There are many other plausible encoding strategies for this controller. MAR --> Address Bus and Request are always asserted together, as are RBUS --> AC, MBUS --> ALU B, MBR --> MBUS, and ALU Result --> RBUS. If we have designed the ALU to pass its A input selectively, we can combine AC --> ALU A in state LD2 with this list of signals. As another example, we can combine MBR --> ABUS and ABUS --> IR. Taken together, these encodings save six ROM bits.

We can save additional ROM bits by finding unrelated signals that are never asserted at the same time. These are good candidates for encoding. For example, we can combine PC --> ABUS, IR --> ABUS, and Data Bus --> MBR, encoding them into 2 bits. Applying all of these encodings at the same time yields the encoded control unit in Figure 12.25. The direct ROM outputs have been reduced from 22 to 15.
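A check for such encoding candidates might look like the Python sketch below. The state table is a made-up three-state example, not the contents of Figure 12.24, and the function names are hypothetical. The width calculation reserves one code for "no signal asserted".

```python
def mutually_exclusive(signals, states):
    """True if no state asserts more than one signal from the group."""
    group = set(signals)
    return all(len(group & asserted) <= 1 for asserted in states.values())

def encoded_width(signal_count):
    """Bits needed for signal_count signals plus a 'none asserted' code."""
    return signal_count.bit_length()

# Hypothetical state table, NOT the contents of Figure 12.24.
states = {
    "S0": {"PC --> ABUS", "ABUS --> MAR", "PC + 1 --> PC"},
    "S1": {"Data Bus --> MBR"},
    "S2": {"IR --> ABUS", "ABUS --> MAR"},
}

group = ["PC --> ABUS", "IR --> ABUS", "Data Bus --> MBR"]
print(mutually_exclusive(group, states), encoded_width(len(group)))  # True 2
print(mutually_exclusive(["PC --> ABUS", "ABUS --> MAR"], states))   # False
```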

As more control signals are placed in the ROM in an encoded form, we move from a very horizontal format to one that is ever more vertical. We present a systematic approach to vertical microprogramming next.


12.5.2 Vertical Microprogramming

Vertical microprogramming makes more use of ROM encoding to reduce the length of the control word. To achieve this goal, we commonly use multiple microword formats. For example, many states require no conditional next-state branch; they simply advance to the next state in sequence. Rather than having every microword contain a next state and a list of microoperations, we can shorten the ROM word by separating these two functions into individual microword formats: one for conditional "branch jumps" and another for register transfer operations/microoperations.

Shortening the ROM word does not come free. We may need several ROM words in a sequence to perform the same operations as a single horizontal microword. The combination of extra levels of decoding, multiple ROM accesses to execute a sequence of control operations, and sacrifice of the potential parallelism of the horizontal approach leads to slower implementations. The basic machine cycle time increases, and the number of machine cycles to execute an instruction also increases.

Despite this inefficiency, designers prefer vertical microcode because it is much like coding in assembly language. So the trade-off between vertical and horizontal microcode is really a matter of ease of implementation versus performance.

Vertical Microcode Format for the Simple CPU

Let's develop a simple vertical microcode format for our simple processor. We will introduce just two formats: a branch jump format and a register transfer/operation format.

In a branch jump microword, we include a field to select a signal to be tested (Wait, AC<15>, IR<15>, IR<14>) and the value it should be tested against (0 or 1). If the signal matches the specified value, the rest of the microword contains the address of the next ROM word to be fetched. The condition selection field can be 2 bits in length; the condition comparison field can be 1 bit wide.

The register transfer/operation microword contains three fields: a register source, a register destination, and an operation field for instructing functional units like the ALU what to do. To start, let's arrange the microoperations according to these categories:

Sources:
PC --> ABUS
IR --> ABUS
MBR --> ABUS
AC --> ALU A
MAR --> Mem Address Bus
MBR --> Mem Data Bus
MBR --> MBUS
AC --> RBUS
ALU Result --> RBUS
Destinations:
RBUS --> AC
MBUS --> ALU B
ABUS --> IR
ABUS --> MAR
Mem Data Bus --> MBR
RBUS --> MBR
ABUS --> PC
Operations:
ALU ADD
ALU PASS B
0 --> PC
PC + 1 --> PC
Read (Read/Write high, Request)
Write (Read/Write low, Request)
We can encode the nine sources in a 4-bit field, the seven destinations in 3, and the six operations also in 3 (we have combined Read/Write and Request in the operation format).

It would certainly be convenient to encode all the fields in the same number of bits. At the moment, we have several more sources than destinations. A close examination of the data-path of Figure 11.26 indicates that we can do better at encoding both lists. We can assume that the AC is hardwired to the ALU A input, just as the MBUS is wired to the ALU B input, which removes AC --> ALU A from the sources and MBUS --> ALU B from the destinations. Also, the MBR is the only source on the MBUS, so we can eliminate the microoperation MBR --> MBUS. This gives us seven sources and six destinations, easily encoded in 3 bits each.

There is still one hitch. On writes to memory, such as during a store, the MAR must drive the memory address lines and the MBR must drive the data lines. But as listed above, both are sources, and an encoded source field can select only one source per microword, so these two microoperations are now mutually exclusive.

Fortunately, there is a reasonable solution. We can move the operation MBR --> Mem Data Bus from the sources to the destinations, simply by thinking of the memory as a destination rather than the MBR as a source. The encoding of the two formats can fit in a very compact 10 bits, as shown in Figure 12.26.
We show the ROM contents for the Moore controller in Figure 12.27.
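One possible 10-bit layout can be sketched in Python. The text fixes the field widths, the type values (1 for branch jump, 0 for register transfer), and the 6-bit address in the low-order bits. The position of the remaining fields, the condition codes, and the function names are assumptions, since Figure 12.26 is not reproduced here.

```python
# Condition codes are assumed; the text only lists the four testable signals.
CONDITIONS = {"Wait": 0, "AC<15>": 1, "IR<15>": 2, "IR<14>": 3}

def pack_bj(condition, value, address):
    """Branch jump: type=1 | condition (2 bits) | compare value (1) | address (6)."""
    if value not in (0, 1) or not 0 <= address < 64:
        raise ValueError("field out of range")
    return (1 << 9) | (CONDITIONS[condition] << 7) | (value << 6) | address

def pack_rt(source, destination, operation):
    """Register transfer: type=0 | source (3 bits) | destination (3) | operation (3)."""
    if not all(0 <= field < 8 for field in (source, destination, operation)):
        raise ValueError("field out of range")
    return (source << 6) | (destination << 3) | operation

print(format(pack_bj("Wait", 0, 5), "010b"))  # 1000000101
print(format(pack_rt(1, 2, 3), "010b"))       # 0001010011
```

Both formats fit in 10 bits: 1 + 2 + 1 + 6 for a branch jump and 1 + 3 + 3 + 3 for a register transfer.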

We handle Reset externally. The symbolic format should be intuitively obvious and bears a striking resemblance to assembly language programs. The two alternative formats are denoted by BJ for branch jump and RT for register transfer. The former is written as the condition followed by the next address. For example,
BJ Wait = 0, IF0

is a branch jump microinstruction that tests whether the Wait signal is unasserted. If it is, the microinstruction causes the next microinstruction to be fetched from the ROM location with label IF0.

The RT format is written as SRC --> DST followed by any operations that are performed in parallel with the register transfer. For example,
RT PC --> MAR, PC + 1 --> PC
is a register transfer operation that maps onto the microoperations PC --> ABUS, ABUS --> MAR, and PC + 1 --> PC.
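The mapping from the symbolic RT form to microoperations can be sketched in Python. The DEST_BUS table is an assumption inferred from the microoperation lists (it covers only a few destinations), and expand_rt is a hypothetical helper, not part of the original microassembler.

```python
# Assumed: which internal bus feeds each destination register.
DEST_BUS = {"MAR": "ABUS", "IR": "ABUS", "PC": "ABUS", "AC": "RBUS"}

def expand_rt(instruction):
    """Expand 'RT SRC --> DST, op, ...' into a list of microoperations."""
    mnemonic, body = instruction.split(None, 1)
    if mnemonic != "RT":
        raise ValueError("not a register transfer microinstruction")
    transfer, *operations = [part.strip() for part in body.split(",")]
    source, destination = [name.strip() for name in transfer.split("-->")]
    bus = DEST_BUS[destination]
    return [f"{source} --> {bus}", f"{bus} --> {destination}", *operations]

print(expand_rt("RT PC --> MAR, PC + 1 --> PC"))
# ['PC --> ABUS', 'ABUS --> MAR', 'PC + 1 --> PC']
```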

Discussion

Figure 12.27 leads us to a few observations. First, we have not included an unconditional branch operation in our microcode instruction set. This is handled by two BJ instructions in sequence, testing a condition and its complement, branching to the same place in both cases. Obviously, the microinstruction format could be revised to include such a branch.
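The two-instruction unconditional branch can be illustrated with a tiny Python interpreter. The ROM addresses, the tuple representation of a BJ word, and the function run_until are hypothetical, and the sketch assumes the tested signal does not change between the two microinstructions.

```python
def run_until(program, start, target, signals):
    """Step a BJ-only microprogram; return the number of steps to reach target."""
    upc, steps = start, 0
    while upc != target:
        signal, value, address = program[upc]
        # Branch if the tested signal matches, otherwise fall through.
        upc = address if signals[signal] == value else upc + 1
        steps += 1
    return steps

TARGET = 20  # hypothetical ROM address
program = {
    10: ("Wait", 0, TARGET),  # BJ Wait = 0, TARGET
    11: ("Wait", 1, TARGET),  # BJ Wait = 1, TARGET
}

print(run_until(program, 10, TARGET, {"Wait": 0}))  # 1
print(run_until(program, 10, TARGET, {"Wait": 1}))  # 2
```

Either way control reaches the target, at the cost of one extra ROM word and, in one of the two cases, one extra cycle.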

Second, it is important that signals to the outside world, such as those that connect the MAR and MBR to external buses and those that drive the Read/Write and Request lines, be latched at the controller output. We need this because RT operations that assert these signals are usually followed by BJ operations testing the Wait signal. To implement the handshake with external devices correctly, we must hold the external signals until we encounter the next RT operation. Simple registers for the control outputs, loaded only when executing an RT microoperation, will do the job.

The microprogram requires 31 words by 10 ROM bits, or a total of 310 bits. The horizontal implementation described previously used 16 × 38 bit ROM words (16 next-state bits plus 22 microoperation bits, not counting the 2 multiplexer select bits of the 40-bit word), yielding 608 ROM bits. The vertical format is highly efficient in terms of the number of ROM bits required to implement this particular controller. A good part of these savings comes from the separate branch jump format. We use it only in cases that loop in a state or jump out of sequence.
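The ROM size comparison can be reproduced with a few lines of Python. The 38-bit figure follows the text; the 40-bit variant, which also counts the two multiplexer select bits from the earlier word format, is included only for comparison.

```python
vertical_bits = 31 * 10            # 31 microwords of 10 bits
horizontal_bits = 16 * 38          # 16 states, 16 next-state + 22 microoperation bits
horizontal_with_mux = 16 * 40      # also counting the 2 mux select bits

print(vertical_bits, horizontal_bits, horizontal_with_mux)  # 310 608 640
print(round(vertical_bits / horizontal_bits, 2))            # 0.51
```

Under these counts the vertical ROM needs roughly half the bits, even though it holds almost twice as many words.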

Vertical Microcode Controller Implementation Details

We show a straightforward implementation of the vertical microprogrammed controller in Figure 12.28.
We implement the next-state register as a microprogram counter, with CLR, CNT, and LD. A conditional logic block determines whether to assert CNT or LD based on the microinstruction type and the conditions being tested. External decoders map the encoded register transfer operations onto the microoperations supported by the data-path.

The condition block logic is shown in Figure 12.29. The condition selector bits from the microinstruction select one of four possible signals to be tested. The selected condition is compared with the specified bit. The µPC load signal is asserted if the current microinstruction is a branch jump (type 1) and the condition and the comparator bit are identical. The value to be loaded comes from the low-order 6 bits of the microinstruction. The count signal is asserted if the instruction type is register transfer (type 0) or the condition and the comparator bit are different.
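The load and count equations can be written out as a Python sketch. The type values and the low-order 6-bit address come from the text; the bit positions of the type, selector, and comparator fields, and the selector ordering (Wait, AC<15>, IR<15>, IR<14>), are assumptions made for illustration.

```python
def condition_block(word, wait, ac15, ir15, ir14):
    """Return (load, count, address) for a 10-bit microword."""
    is_branch = (word >> 9) & 1      # assumed: type bit is the top bit
    select = (word >> 7) & 0b11      # assumed: 2-bit condition selector
    compare = (word >> 6) & 1        # assumed: 1-bit comparator value
    address = word & 0b111111        # low-order 6 bits, per the text
    condition = (wait, ac15, ir15, ir14)[select]
    load = bool(is_branch) and condition == compare
    count = (not is_branch) or condition != compare
    return load, count, address

bj_word = 0b1000000101  # branch jump: test Wait against 0, address 5
rt_word = 0b0001010011  # register transfer word (fields irrelevant here)

print(condition_block(bj_word, 0, 0, 0, 0))  # (True, False, 5)
print(condition_block(bj_word, 1, 0, 0, 0))  # (False, True, 5)
print(condition_block(rt_word, 0, 0, 0, 0)[:2])  # (False, True)
```

Load and count are complementary, so the microprogram counter either jumps or advances on every cycle.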


12.5.3 Writable Control Store

Control store need not be fixed in ROM. Some computers map part of the control store addresses onto RAM, the same memory that programmers use for instructions and data. This has the added flexibility that assembly language programmers can write their own microcode, extending the machine's "native" instruction set with special-purpose instructions.

Of course, most programmers are not sophisticated enough to write their own microcode, yet many machines with complex instruction sets still provide a writable control store. The reason is simple. Since the microprogram for a complex state machine is itself rather complex, it is not uncommon for it to be filled with bugs. Having a writable control store makes it easy to revise the machine control and update it in the field. At power-up time, the machine executes a "boot" microprogram sequence from ROM that loads the rest of the microcode into RAM from an external device like a floppy disk.
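A writable control store of this kind might be modeled as in the Python sketch below. The class name ControlStore, the address split between ROM and RAM, and the sample microwords are all hypothetical; the sketch only illustrates a fixed boot region followed by a loadable region.

```python
class ControlStore:
    """Control store with a read-only boot region followed by writable RAM."""

    def __init__(self, boot_rom, ram_words):
        self._rom = tuple(boot_rom)   # fixed boot microprogram
        self._ram = [0] * ram_words   # loaded at power-up, patchable later

    def read(self, address):
        if address < len(self._rom):
            return self._rom[address]
        return self._ram[address - len(self._rom)]

    def write(self, address, word):
        if address < len(self._rom):
            raise PermissionError("boot microcode in ROM is not writable")
        self._ram[address - len(self._rom)] = word

# Made-up microwords: two boot words in ROM, four RAM words for the rest.
store = ControlStore([0x205, 0x053], ram_words=4)
store.write(2, 0x155)   # e.g. a microword read from an external device
print(hex(store.read(2)))  # 0x155
```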