3.1. Arithmetic and Logic OperationsReading Assignments and Exercises
3.1.1. Boolean Addition
Figure 3.1. Example of Boolean addition with carry propagation, adapted from [Maf01].
3.1.2. Boolean Subtraction
Figure 3.2. Example of Boolean subtraction using (a) unsigned binary representation, and (b) addition with twos complement negation - adapted from [Maf01].
Figure 3.3. Example of overflow in Boolean arithmetic, adapted from [Maf01].
3.1.4. MIPS Overflow Handling
$epcstores the address of the instruction that caused the interrupt, and the instruction
mfc register, $epc
$epcto register. For example, register could be
$t1. This is an efficient approach, since no conditional branch is needed to test for overflow.
subinstructions) raise exceptions on overflow. In contrast, unsigned arithmetic (
addiu) instructions do not raise an exception on overflow, since they are used for arithmetic operations on addresses (recall our discussion of pointer arithmetic in Section 2.6). In terms of high-level languages, C ignores overflows (always uses
subu), while FORTRAN uses the appropriate instruction to detect overflow. Figure 3.4 illustrates the use of conditional branch on overflow for signed and unsigned addition operations.
Figure 3.4. Example of overflow in Boolean arithmetic, adapted from [Maf01].
3.1.5. Logical OperationsLogical operations apply to fields of bits within a 32-bit word, such as bytes or bit fields (in C, as discussed in the next paragraph). These operations include shift-left and shift-right operations (
srl), as well as bitwise and, or (
ori). As we saw in Section 2, bitwise operations treat an operand as a vector of bits and operate on each bit position.
receiveris formed from an 8-bit field called receivedByte and two one-bit fields called ready and enable. The C routine sets
receiver.readyto 0 and
Figure 3.5. Example of C bit field use in MIPS, adapted from [Maf01].
sllinstruction loads the contents of
$s1(the receiver) into
$s0(the data register), and the result of this is shown on the second line of the register contents. Next, the
$s024 bits, thereby discarding the enable and ready field information, leaving just the received byte. To signal the receiver that the data transfer is completed, the
oriinstructions are used to set the enable and ready bits in
$s1, which corresponds to the receiver. The data in
$s0has already been received and put in a register, so there is no need for its further manipulation.
3.2. Arithmetic Logic Units and the MIPS ALUReading Assignments and Exercises
3.2.1. Basic Concepts of ALU Design
3.2.2. 1-bit ALU Design
Figure 3.6. Example of a simple 1-bit ALU, where the oval represents a multiplexer with a control code denoted by Op and an output denoted by C - adapted from [Maf01].
Figure 3.7. Carry-in and carry-out in Boolean addition, adapted from [Maf01].
Figure 3.7. Full adder circuit (a) sum-of-products form from above-listed truth table, (b) CarryOut production, and (c) one-bit full adder with carry - adapted from [Maf01].
Figure 3.9. One-bit ALU with three operations: and, or, and addition: (a) Least significant bit, (b) Remaining bits - adapted from [Maf01].
3.2.3. 32-bit ALU Design
Figure 3.10. 32-bit ALU with three operations: and, or, and addition - adapted from [Maf01].
3.2.4. MIPS ALU Design
sltinstruction (set on less-than) has the following format:
slt rd, rs, rt
- Step 1. Perform subtraction using negation and a full adder
Step 2. Check most significant bit (sign bit)
Step 3. Sign bit tells us whether or not A < B
slt, we need (a) new input line called Less that goes directly to the mux, and (b) a new control code (111) to select the
sltoperation. Unfortunately, the result for
sltcannot be taken directly as the output from the adder. Instead, we need a new output line called Set that is used only for the
sltinstruction. Overflow detection logic is also associated with this bit. The additional logic that supports
sltis shown in Figure 3.11.
Figure 3.11. One-bit ALU with additional logic for slt operation - adapted from [Maf01].
sltinstruction is (a) augmentation of each of 32 muxes to have three control lines instead of two, (b) augmentation of each of 32 one-bit ALU's control signal structure to have an additional (Less) input, and (c) the addition of overflow detection circuitry, a Set output, and an xor gate on the output of the sign bit.
bneInstruction. Recall the branch-on-not-equal instruction
bne r1, r2, Label, where r1 and r2 denote registers and Label is a branch target label or address. To implement
bne, we observe that the following implication holds:
Figure 3.12. 32-bit ALU with additional logic to support bne and slt instructions - adapted from [Maf01].
srainstructions, these are supported in the ALU under design by adding a data line for the shifter (both left and right). However, the shifters are much more easily implemented at the transistor level (e.g., outside the ALU) rather than trying to fit more circuitry onto the ALU itself.
Figure 3.13. Four bit barrel shifter, where "x >> 1" denotes a shift amount greater than one - adapted from [Maf01].
Figure 3.14. Supporting immediate instructions on a MIPS ALU design, where IR denotes the instruction register, and (/16) denotes a 16-bit parallel bus - adapted from [Maf01].
3.2.5. ALU Performance Issues
P1 = p7 + p6 + p5 + p4
P2 = p11 + p10 + p9 + p8
P3 = p15 + p14 + p13 + p12
G0 = g3 + p3g2 + p3p2g1 + p3p2p1g0
G1 = g7 + p7g6 + p7p6g5 + p7p6p5g4
G2 = g11 + p11g10 + p11p10g9 + p11p10p9g8
G3 = g15 + p15g14 + p15p14g13 + p15p14p13g12
3.3. Boolean Multiplication and DivisionReading Assignments and Exercises
3.3.1. Multiplier Design
Multiplicand: 0010 # Stored in register r1 Multiplier: x 1101 # Stored in register r2 -------------------- Partial Prod 0010 # No shift for LSB of Multiplier " " 0000 # 1-bit shift of zeroes (can omit) " " 0010 # 2-bit shift for bit 2 of Multiplier " " 0010 # 3-bit shift for bit 3 of Multiplier -------------------- # Zero-fill the partial products and add PRODUCT 0011010 # Sum of all partial products -> r3
Figure 3.15. Pencil-and-paper multiplication of 32-bit Boolean number representations: (a) algorithm, and (b) simple ALU circuitry - adapted from [Maf01].
Figure 3.16. Second version of pencil-and-paper multiplication of 32-bit Boolean number representations: (a) algorithm, and (b) schematic diagram of ALU circuitry - adapted from [Maf01].
Figure 3.17. Third version of pencil-and-paper multiplication of 32-bit Boolean number representations: (a) algorithm, and (b) schematic diagram of ALU circuitry - adapted from [Maf01].
Figure 3.18. Booth's procedure for multiplication of 32-bit Boolean number representations: (a) algorithm, and (b) schematic diagram of ALU circuitry - adapted from [Maf01].
3.3.2. Design of Arithmetic Division Hardware
Figure 3.19. Division of 32-bit Boolean number representations: (a) algorithm, (b) example using division of the unsigned integer 7 by the unsigned integer 3, and (c) schematic diagram of ALU circuitry - adapted from [Maf01].
Figure 3.20. Division of 32-bit Boolean number representations: (a) algorithm, and (b,c) examples using division of +7 or -7 by the integer +3 or -3; adapted from [Maf01].
Self-Exercise. Be able to trace each example shown in Figure 3.20b,c through the algorithm whose flowchart is given in Figure 3.20a. Know how each part of the algorithm works, and why it behaves that way. Hint: This exercise, or a part of it, is likely to be an exam question.
srainstructions. The upper (high) 32 bits of the register contains the remainder resulting from division. This is moved into a register in the MIPS register stack (e.g.,
$t0) by the
mfhicommand. The lower 32 bits of the 64-bit register contains the quotient resulting from division. This is moved into a register in the MIPS register stack by the
divinstruction and unsigned division, by the
divuinstruction. MIPS hardware does not check for division by zero. Thus, divide-by-zero exception must be detected and handled in system software. A similar comment holds for overflow or underflow resulting from division.
Figure 3.21. MIPS ALU supporting the integer arithmetic operations (+,-,x,/), adapted from [Maf01].
Self-Exercise. Show how the MIPS ALU in Figure 3.21 supports the integer arithmetic operations (+,-,x,/) using the algorithms and hardware diagrams given thus far. Hint: This exercise, or a part of it, is likely to be an exam question.
3.4. Floating Point ArithmeticReading Assignments and Exercises
3.4.1. Scientific Notation and FP Representation
3.4.2 Overflow and Underflow
doubledatatype, and can support numbers with exponents ranging from -30810 to 30810. The primary advantage is greater precision in the mantissa.
3.4.3. IEEE 754 Standard
Figure 3.22. Denorms: (a) Gap between zero and 2-127, and (b) Denorms close this gap - adapted from [Maf01].
3.4.4. FP Arithmetic
Example 1. Using decimal numbers for clarity, let x = -1.5 · 1038, y = 1.5 · 1038, and z = 1.0. With floating point representation, we have:x + (y + z) = -1.5 · 1038 + (1.5 · 1038 + 1.0) = 0.0and
(x + y) + z = (-1.5 · 1038 + 1.5 · 1038) + 1.0 = 1.0The difference occurs because the value 1.0 cannot be distinguished in the significand of 1.5 · 1038 due to insufficient precision (number of digits) of the significand in the FP representation of these numbers (IEEE 754 assumed).
- Denormalize the operands and shift one of the operands to make the exponents of both numbers equal (we denote the exponent by E).
- Add or subtract the significands to get the resulting significand.
- Normalize the resulting significand and change E to reflect any shifts incurred by normalization.
3.5. Floating Point in MIPSReading Assignments and Exercises
div.s, whereas double precision instructions are
div.d. These instructions are much more complicated than their integer counterparts. Problems with implementing FP arithmetic include inefficiencies in having different instructions that take significantly different times to execute (e.g., division versus addition). Also, FP operations require much more hardware than integer operations.
$f1, ..., etc. Most of these registers are specified in the
.dinstructions. Double precision operands are stored in register pairs (e.g.,
mtc0, etc.), where cn refers to coprocessor #n. Similarly, special I/O operations are required to load and store data between the coprocessor and memory (e.g.,
Figure 3.23. MIPS ALU supporting floating point addition, adapted from [Maf01].
lwc1instruction with the global pointer as base address and the variables
const9as offsets. The single precision division operation puts the quotient of 5.0/9.0 into
$f16, and the remainder of the computation is straightforward. As in all MIPS procedure calls, the
jrinstruction returns control to the address stored in the