In this section we look at the design of multiplier circuitry.
The methods we introduce are combinational, although alternative methods
based on circuits with state are also possible. However, the fastest circuits
for multiplication use just the techniques we will be discussing here.
Basic Concept Throughout this section, we will look only at multiplication techniques for unsigned numbers. Alternatively, the hardware we present is suitable for sign and magnitude multiplication, but we concentrate on the manipulation of the magnitude part. Recall that the two numbers involved in a multiplication are called the multiplicand and the multiplier.
The process of binary multiplication is best illustrated
with an example. In this case, the multiplicand is 11012
Each bit of the multiplier is multiplied against the
multiplicand, the product is aligned according to the position of the bit
within the multiplier, and the resulting products are then summed to form
the final result. One attraction of binary multiplication is how easy it
is to form these intermediate products: if the multiplier bit is a 1, the
product is an appropriately shifted copy of the multiplicand; if the multiplier
bit is a 0, the product is simply 0.
For an n-bit multiplicand and multiplier, the resulting product will be 2n bits. Stated differently, the product of 2n and 2n is 2n + n = 22n, a 2n-bit number. Thus, the product of two 4-bit numbers requires 8 bits, of two 8-bit numbers requires 16 bits, and so on.
Partial Product Accumulation We can construct a combinational circuit that directly implements the process described by the preceding example. The method is called partial product accumulation.
First, we rewrite the multiplicand bits as A3,
A2, A1, A0 and the multiplier bits as B3,
B2, B1, B0. The multiplication of A
and B becomes
Each of the ANDed terms is called a partial product.
The resulting product is formed by accumulating down the columns of partial
products, propagating the carries from the rightmost columns to the left.
A combinational circuit for implementing the 4-bit
multiplier is shown in Figure 5.28.
The first level of 16 AND gates computes the individual partial products.
The second- and third-level logic blocks form the accumulation of the products
on a column-by-column basis. The column sums are formed by a mixture of
cascaded half adders and full adders. In the figure, inputs from the top
are the bits to be added and the input from the right is the carry-in. The
output from the bottom is the sum and to the left is the carry-out.
To see how the partial products are accumulated, let's look at the circuit of Figure 5.28 in a little more detail. S1 is the straightforward sum of just two partial products, A1 · B0 and A0 · B1. S2 is the sum of three products. We implement this with two cascaded adders, one of which takes the carry-out from S1's column.
S3 is a little more complicated, because there are two different carry-outs from the previous column. We use three cascaded adders, two full adders and one half adder, to implement the sum. The two carry-outs from S2 are accumulated through the carry-in inputs of the two full adders.
S4 is the sum of three products and three possible carry-outs from S3. The carries make this case more complicated than the S2 sum. The solution is to implement the sum with three adders-two full adders and one half adder. The two full adders sum the three products and two of the carry-outs. The half adder adds to this result the third possible carry-in.
The logic for S5 is similar. Here we must sum two products and three carries. Two full adders do the job. Note that the second full adder sums two of the three carries from the previous column with the result of the first full adder. A similar analysis applies to S7 and S6.
The delay through the multiplier is determined by the ripple carries be-tween the adders. We can use a carry look-ahead scheme to reduce these delays.
Clearly, the full combinational multiplier uses a lot of hardware. The dominating costs are the adders-four half adders and eight full adders. To simplify the implementation slightly, a designer may choose to use full adders for all of the adder blocks, setting the carry input to 0 where the half adder function is required. Given the full adder schematic of Figure 5.7, this is 12 adders of six gates each, for a total of 72 gates. When we add to this the 16 gates forming the partial products, the total for the whole circuit is 88 gates. It is easy to see that combinational multipliers can be justified only for the most high performance of -applications.
A slightly different implementation of the 4-by-4 combinational
multiplier is shown in Figure 5.29.
Figure 5.29
TTL Multiplier Components The
TTL components 74284 and 74285 provide a two-chip implementation of a 4-by-4
parallel binary multiplier.
Figure 5.30 illustrates their use. The 74284 component implements the
high-order 4 bits of the product, while the 74285 implements the low-order
bits. Both chips have dual active low output enable signals, and . When both enables are 0, the outputs are valid. Otherwise, they
are in the high-impedance state.
Basic Concept Throughout this section, we will look only at multiplication techniques for unsigned numbers. Alternatively, the hardware we present is suitable for sign and magnitude multiplication, but we concentrate on the manipulation of the magnitude part. Recall that the two numbers involved in a multiplication are called the multiplicand and the multiplier.
(
13)
and the multiplier is 10112 (
11)
:
For an n-bit multiplicand and multiplier, the resulting product will be 2n bits. Stated differently, the product of 2n and 2n is 2n + n = 22n, a 2n-bit number. Thus, the product of two 4-bit numbers requires 8 bits, of two 8-bit numbers requires 16 bits, and so on.
Partial Product Accumulation We can construct a combinational circuit that directly implements the process described by the preceding example. The method is called partial product accumulation.
To see how the partial products are accumulated, let's look at the circuit of Figure 5.28 in a little more detail. S1 is the straightforward sum of just two partial products, A1 · B0 and A0 · B1. S2 is the sum of three products. We implement this with two cascaded adders, one of which takes the carry-out from S1's column.
S3 is a little more complicated, because there are two different carry-outs from the previous column. We use three cascaded adders, two full adders and one half adder, to implement the sum. The two carry-outs from S2 are accumulated through the carry-in inputs of the two full adders.
S4 is the sum of three products and three possible carry-outs from S3. The carries make this case more complicated than the S2 sum. The solution is to implement the sum with three adders-two full adders and one half adder. The two full adders sum the three products and two of the carry-outs. The half adder adds to this result the third possible carry-in.
The logic for S5 is similar. Here we must sum two products and three carries. Two full adders do the job. Note that the second full adder sums two of the three carries from the previous column with the result of the first full adder. A similar analysis applies to S7 and S6.
The delay through the multiplier is determined by the ripple carries be-tween the adders. We can use a carry look-ahead scheme to reduce these delays.
Clearly, the full combinational multiplier uses a lot of hardware. The dominating costs are the adders-four half adders and eight full adders. To simplify the implementation slightly, a designer may choose to use full adders for all of the adder blocks, setting the carry input to 0 where the half adder function is required. Given the full adder schematic of Figure 5.7, this is 12 adders of six gates each, for a total of 72 gates. When we add to this the 16 gates forming the partial products, the total for the whole circuit is 88 gates. It is easy to see that combinational multipliers can be justified only for the most high performance of -applications.
(
a)
gives the basic building block,
a full adder circuit that sums a locally computed partial product (
X
· Y)
, an input passed into the block from above
(
Sum In)
, and a carry passed from a block
diagonally above. It generates a carry-out (
Cout)
and a new sum (
Sum Out)
. Figure 5.29(
b)
shows the interconnection of 16 of these blocks to implement the full multiplier
function. The Ai values are distributed along block diagonals and
the Bi values are passed along rows. This implementation uses the
same gate count as the previous one: 16 AND gates and 12 adders (
the
top row does not need adders)
.
No comments:
Post a Comment