9.1 Motivation for Optimization
To review, the finite state machine design process consists of(
1)
understanding the problem, (
2)
obtaining a formal description (
ultimately, a symbolic
state transition table)
, (
3)
minimizing the number of states, (
4)
encoding
the states, (
5)
choosing the flip-flops to implement
the state registers, and finally (
6)
implementing
the finite state machine's next-state and output functions. This chapter
starts at step 3 and carries us through to the final implementation at step
6, using methods based on discrete logic gates.9.1.1 Two State Diagrams, Same I/O Behavior
In the age of very large scale integrated circuits, why should we bother to minimize a finite state machine implementation? After all, as long as the input/output behavior of the machine is correct, it really doesn't matter how it is implemented. Or does it?For a particular finite state machine, there are many equivalent forms. Rather than reusing states while deriving the state diagram, you could simply introduce a new state whenever you need one
(
to
keep the number of states finite, you will need to reuse some of them, of
course)
.The two implementations of the state diagrams of Figure 9.1 are certainly not the same. The machine with more states requires more flip-flops and more complex next-state logic.
9.1.2 Advantages of Minimum States
In general, you will find it is worthwhile to implement the finite state machine in as few states as possible. This usually reduces the number of logic gates and flip-flops you need for the machine's implementation.Similarly, judicious mapping between symbolic and encoded states can reduce the implementation logic. For the parity checker, our implementation in Chapter 8 required no gates because we made a good state assignment that naturally matched the control input to the toggle flip-flop.
A state diagram with n states must be implemented with at least k flip-flops, where 2k - 1 < n ð 2k. By reducing the number of states to 2k - 1 or less, you can save a flip-flop. For example, suppose you are given a finite state machine with five state flip-flops. This machine can represent up to 32 states. If you can reduce the number of states to 16 or less, you save a flip-flop.
Even when reducing the number of states is not enough to eliminate a flip-flop, it still has advantages. With fewer states, you introduce more don't-care conditions into the next-state and output functions, making their implementation simpler. Less logic usually means shorter critical timing paths and a higher clock rate for the system.
More important, today's programmable logic provides limited gate and flip-flop counts on a single programmable logic chip. A typical programmable logic part might have "2000 gate equivalents"
(
rarely
approached in practice)
yet provide only 64 flip-flops! An
important goal of state reduction is to make the implementation "fit"
in as few components as possible. The fewer components you use, the shorter
the design time and the lower the manufacturing cost.State reduction techniques also allow you to be sloppy in obtaining the initial finite state machine description. If you have introduced a few redundant states, you will find and eliminate them by using the state reduction techniques introduced next.
9.2 State Minimization/Reduction
State reduction identifies and combines states that have equivalent "behavior." Two states have equivalent behavior if, for all input combinations, their outputs are the same and they change to the same or equivalent next states.For example, in Figure 9.1
(
b)
,
states S0 and S2 are equivalent. Both states output a
0; both change to S1 on a 1 and self-loop on a 0. Combining these
into a single state leads to Figure 9.1(
a)
. On
all input strings, the output sequence of either state diagram is exactly
the same. Algorithms for state reduction begin with the symbolic state transition table. First, we group together states that have the same state outputs
(
Moore machine)
or transition outputs
(
Mealy machine)
. These are potentially equivalent,
since states cannot be equivalent if their outputs differ. Next, we examine the transitions to see if they go to the same next state for every input combination. If they do, the states are equivalent and we can combine them into a renamed new state. We then change all transitions into the newly combined states. We repeat the process until no additional states can be combined.
In the following two subsections, we examine alternative algorithms for state reduction: row matching and implication charts. The former is a good pencil-and-paper method, but does not always obtain the best reduced state table. Implication charts are more complex to use by hand, but they are easy to implement via computer and do find the best solution.
We can always combine the two approaches. Row matching quickly reduces the number of states. The more complicated implication chart method, now working with fewer states, finds the equivalent states missed by row matching more rapidly.
9.2.1 Row-Matching Method
Let's begin our investigation of the row-matching method with a detailed example. We will see how to transform an initial state diagram for a simple sequence detector into a minimized, equivalent state -diagram.Four-Bit Sequence Detector: Specification and Initial State Diagram Let's consider a sequence-detecting finite state machine with the following specification. The machine has a single input X and output Z. The output is asserted after each 4-bit input sequence if it consists of one of the binary strings 0110 or 1010. The machine returns to the reset state after each 4-bit sequence.
We will assume a Mealy implementation. Some sample behavior of the finite state machine is
=
0010 0110 1100 1010
0011 Z
=
0000 0001 0000 0001 0000 Because this finite state machine recognizes finite length strings, we can place an upper bound on the number of states needed to recognize any particular binary string of length four.
(
hence
the term "row matching")
. For this finite state machine,
we can combine S10 and S12. Let's call the new state S10'
and use it to rename all transitions to S10 or S12. The
revised state table is shown in Figure 9.4.
9.2.2 Implication Chart Method
The implication chart method is a more systematic approach to finding the states that can be combined into a single reduced state. As you might suspect, the method is more complex and is better suited for machine implementation than hand use.(
a)
shows the chart with an entry
for every pair of states. This form of the chart is more complicated than
it needs to be. For example, the diagonal entries are not needed: it does
not reduce states to combine a state with itself! And the upper and lower
triangles of entries are symmetric. The chart entry for Si and
Sj contains the same information as that for Sj and Si.
Thus, we work with the reduced structure of Figure 9.10(
b)
.We fill in the implication chart as follows. Let Xij be the entry whose row is labeled by state Si and whose column is labeled by state Sj. If we were able to combine states Si and Sj, it would imply that their next-state transitions for each input combination must also be equivalent. The chart entry contains the next-state combinations that must be equivalent for the row and column states to be equivalent. If Si and Sj have different outputs or next-state behavior, an X is placed in the entry. This indicates that the two states can never be equivalent.
Three-Bit Sequence Detector: Initial Implication Chart The implication chart for the example state table is shown in Figure 9.11. S0, S1, S2, S3, and S5 have the same outputs and are candidates for being combined. Similarly, states S4 and S6 might also be combined. Any combination of states across the two groups, such as S1 and S4, is labeled by an X in the chart. Since their outputs are different, they can never be equivalent.
To fill in the chart entry for
(
row)
S1 and (
column)
S0, we look
at the next-state transitions. S0 goes to S1 on 0 and
S2 on 1, while S1 goes to S3 and S4,
respectively. We fill the chart in with S1-S3, the transitions
on 0, and S2-S4, the transitions on 1. We call these groupings
implied state pairs. The entry means that S0 and S1
cannot be equivalent unless S1 is equivalent to S3 and
S2 is equivalent to S4. The rest of the entries are filled
in similarly.At this point, the chart already contains enough information to eliminate many impossible equivalent pairs. For example, we already know that S2 and S4 cannot be equivalent: they have different output behavior. Thus there is no way that S0 can be equivalent to S1.
Finding these cases is straightforward. We visit the entries in sequence. For example, start with the top square in the first column and advance from top to bottom and left to right. If square Si,Sj contains the implied state pair Sm-Sn and square Sm,Sn contains an X, then mark Si,Sj with an X as well.
Sequence Detector Example: First Marking Pass Figure 9.12 contains the results of this first marking pass. Entry S2,S0 is marked with an X because the chart entry for the implied state pair S2-S6 is already marked with an X. Entry S3,S0 is also marked, because entry S1,S0
(
as well as S2,S0)
has just been marked. The same is true for S5,S0. By the
end of the pass, the only entries not marked are S2,S1;
S5,S3; and S6,S4.Sequence Detector Example: Second Marking Pass We now make a second pass through the chart to see if we can add any new markings. Entry S2,S1 remains unmarked. Nothing in the chart refutes that S3 and S5 are equivalent. The same is true of S4 and S6.
(
S0)
for all input combinations.
(
these are highlighted
in the figure)
. Their being unmarked implies that S4
is equivalent to S0 (
renamed S0')
and S3 is equivalent to S5 (
S3')
.
The reduced state table is given in Figure 9.17.
Example
State Reduction of Parity Checker Finite
State Machine
The row-matching method could not combine states
S0 and S2 in the three-state parity checker of Figure
9.1. Can the implication chart method do the job?The implication chart for the state transition table of Figure 9.8 is given in Figure 9.18.
Implication Chart Summary The algorithm for state reduction using the implication chart method consists of the following steps:
- Construct the implication chart, consisting of one square for each
possible combination of states taken two at a time.
- For each square labeled by states Si and Sj, if
the outputs of the states differ, mark the square with an X; these states
are not equivalent. Otherwise, they may be equivalent. Within the
square, write implied pairs of next states for all input combinations.
- Systematically advance through the squares of the implication chart.
If the square labeled by states Si,Sj contains an implied
pair Sm-Sn and square Sm,Sn is marked
with an X, then mark Si,Sj with an X. Since Sm
and Sn are not equivalent, neither are Si and Sj.
- Continue executing step 3 until no new squares are marked with an
X.
- For each remaining unmarked square Si,Sj, you can conclude that states Si and Sj are equivalent.
9.3 State Assignment
The number of gates needed to implement a sequential logic network depends strongly on how we assign encoded Boolean values to symbolic states. Unfortunately, the only way to obtain the best possible assignment is to try every choice for the encoding, an extremely large number for real state machines. For example, a four-state finite state machine, such as the traffic light controller of the last chapter, has 4!(
4 factorial)
=
4 * 3 * 2
* 1 =
24 different encodings (
see Figure 9.19)
.
9.3.1 Traffic Light Controller
To illustrate the impact of state encoding on the next-state and output logic, let's use the symbolic state transition table for the traffic light controller, shown in Figure 9.20.=
Green, 01 =
Yellow,
and 10 =
Red.
(
one for each row of Figure 9.21)
.
=
00, HY =
01, FG
=
11, FY =
10 and HG =
00, HY =
10, FG =
01, FY =
11 respectively. A cursory glance
shows that the second encoding uses fewer product terms, eight versus nine,
and fewer literals, 21 versus 26.
With conventional gate logic, the encoding requires 3 five-input gates, 2 four-input gates, 6 three-input gates, and 2 two-input gates, a total of 13 gates. We assume that variables and their complements are available to the network.
In the next two subsections, we present methods for finding good state encodings. These are suitable for pencil and paper, as well as computer-aided design tools.
9.3.2 Pencil-and-Paper Methods
Without computer-aided design tools, there is little you can do to generate a good encoding. Hand enumeration using trial and error becomes tedious even for a relatively small number of states. An n-state finite state machine has n! different encodings. And this is only the lower bound. If the state is not densely encoded in the fewest number of bits, even more encodings are possible.To make the problem more tractable when you must use hand methods, designers have developed a collection of heuristic "guidelines." These try to reduce the distance in Boolean n-space between related states. For example, if state Y is reached by a transition from state X, then the encodings should differ by as few bits as possible. The next-state logic will be minimized if you follow such guidelines. We examine them in this section.
(
a)
is not as good as the one in Figure 9.25(
b)
under this criterion:
Transition | First Assignment Bit Changes | Second Assignment Bit Changes |
---|---|---|
S0 to S1: | 2 | 1 |
S0 to S2: | 3 | 1 |
S1 to S3: | 3 | 1 |
S2 to S3: | 2 | 1 |
S3 to S4: | 1 | 1 |
S4 to S1: | 2 | 2 |
Note how we used the edge adjacency of the state map. This is so we can place S3 between the assignments for S1 and S2, since it is the target of transitions from both of these states.
Finally, we place S4 adjacent to S3, since it is the destination of S3's only transition. It would be perfect if S4 could also be placed distance 1 from S0, but it is not possible to do this and satisfy the other desired adjacencies.
The resulting assignment exhibits only seven bit transitions. There may be many other assignments with the same number of bit transitions, and perhaps an assignment that needs even fewer.
The minimum-bit-change heuristic, although simple, is not likely to achieve the best assignment. For a finite state machine like the traffic light controller, cycling through its regular sequence of states, the minimum transition distance is obtained by a Gray code assignment: HG
=
00, HY =
01, FG =
11, FY =
10. This was the first state assignment we tried in the previous subsection,
and it was not as good as the second assignment, even though the latter
did not involve a minimum number of bit changes.Guidelines Based on Next State and Input/Outputs Although the criterion of minimum transition distance is simple, it suffers by not considering the input and output values in determining the next state. A second set of heuristic guidelines makes an effort to consider this in the assignment of states:
- Highest priority: States with the same next state for a given
input transition should be given adjacent assignments in the state map.
- Medium priority: Next states of the same state should be given adjacent assignments in the state map. Lowest priority: States with the same output for a given input should be given adjacent assignments in the state map.
Example
Applying the Guidelines
Consider
the state transition table of Figure 9.13 for the 3-bit sequence detector.
The corresponding state diagram is shown in Figure 9.27.
The lowest-priority assignments are made for states that have the same output behavior for a given input. S0, S1', and S3' all output 0 when the input is 0. Similarly, S0, S1', S3', and S4' output 0 when the input is 1.
The constraints on the assignments can be summarized as follows:
Highest priority:(
S3', S4')
;
Medium priority:(
S3', S4')
;
- Lowest priority: 0/0:
(
S0, S1',
S3')
; 1/0:
(
S0, S1', S3',
S4')
;
Example
Applying the Guidelines in a More Complicated
Case
As another example, let's consider the more complicated
state diagram of the 4-bit string recognizer of Figure 9.7. Applying the
guidelines yields the following set of assignment constraints:
Highest priority:(
S3', S4')
,(
S7', S10')
;
Medium priority:(
S1, S2)
, 2 ¥(
S3', S4')
,(
S7', S10')
;
- Lowest priority: 0/0:
(
S0, S1,
S2, S3', S4', S7')
;1/0:
(
S0, S1, S2, S3',
S4', S7', S10')
;
(
a)
and
first assign the reset state to the encoding for 0. Since (
S3',
S4')
is both a high-priority and medium-priority
adjacency, we make their assignments next. S3' is assigned
011 and S4' is assigned 111.
(
S7', S10')
next because this pair also appears in the high- and medium-priority lists.
We assign them the encodings 010 and 110, respectively. Besides giving them
adjacent assignments, this places S7 near S0, S3',
and S4', which satisfies some of the lower-priority -adjacencies.The final adjacency is
(
S1, S2)
.
We give them the assignments 001 and 101. This satisfies a medium-priority
placement as well as the lowest--priority placements.The second assignment is shown in Figure 9.29
(
b)
.
We arrived at it by a similar line of reasoning, except that we assigned
S7' and S10' the states 100 and 110.
The second assignment does about as good a job as the first, satisfying
all of the high- and medium-priority guidelines, as well as most of the
lowest-priority ones.Applying the Guidelines: Why They Work The state assignment guidelines attempt to maximize the adjacent groupings of 1's in the next-state and output functions. Let P2, P1, and P0 be the next-state functions, expressed in terms of the current state Q2, Q1, Q0 and the input X. To see how effective the guidelines were, let's compare the assignment of Figure 9.29
(
a)
with a more naive assignment: S0
=
000, S1 =
001, S2 =
010, S3' =
011, S4' =
100, S7' =
101, S10' =
110. Figure 9.30 compares the encoded next-state tables and K-maps for the two encodings. The 1's are nicely clustered in the next state K-maps for the assignment derived from the guidelines. We can implement P2 with three product terms and P1 and P2 with one each.
In the second assignment, the 1's are spread throughout the K-maps, since we made the assignment with no attempt to cluster the 1's usefully. In this implementation, the next-state functions P2, P1, and P0 each require three product terms, with a considerably larger number of literals overall.
9.3.3 One Hot Encodings
So far, our goal has been dense encodings: state encodings in as few bits as possible. An alternative approach introduces additional flip-flops, in the hope of reducing the next-state and output logic.One form of this method is called one hot encoding. A machine with n states is encoded using exactly n flip-flops. Each state is represented by an n-bit binary code in which exactly 1 bit is asserted. This is the origin of the term "one hot."
Let's consider the traffic light finite state machine described earlier in this section. The following would be a possible one hot encoding of the machine's state:
The state is encoded in four flip-flops rather than two, and only 1 bit is asserted in each of the states.HG=
0001 HY=
0010 FG=
0100 FY=
1000
The product terms are all five and six variables, with two to five terms per output. This is rather complex for discrete logic but would not cause problems for a PLA-based design.
9.3.4 Computer Tools: Nova, Mustang, Jedi*
The previous subsections described various heuristic
approaches for obtaining a good state encoding. None are guaranteed to obtain
the best result. In this section, we examine three programs for computer-generated
state assignments. The state assignment guidelines of Section 9.3.2 place related states together in the state map, thereby clustering the 1's in those functions. However, we cannot evaluate an assignment fully without actually minimizing the functions. When it comes to hand techniques, this means the K-map method. If we have to use K-maps to minimize the functions for each possible assignment, we are not likely to examine very many of them!
Tools are available that follow the basic kinds of heuristics described above. Unlike hand methods, they are not limited to six state bits, they can generate many alternative assignments rapidly, and they can evaluate the derived assignments by invoking minimization tools like espresso or misII automatically. Three tools that provide this function are nova, for two-level logic implementations, and mustang and jedi, for multilevel logic. Jedi is somewhat more general; it can be used for -general-purpose symbolic encodings, such as outputs, as well as next states. We examine each of these programs in the following subsections.
Nova: State Assignment for Two-Level Implementation The inputs to nova are similar to the truth table format already described for espresso. The input file consists of the truth table entries for the finite state machine description. The latter are of the form:
inputs current_state next_state outputs
Figure 9.32 shows the format for our example traffic light controller. The states are symbolic; the inputs and outputs are binary encoded. It is important to separate each section with a space. We interpret the first line as "if input 1
(
car sensor)
is unasserted and the state
is HG (
highway green)
, then the next state is
HG (
highway green)
and the outputs are 00010 (
ST
=
0, H1:H0 =
00 "green," F1:F0 =
10 "red")
.
=
00, HY =
11, FG =
01, and FY
=
10. The espresso truth table is included in the
output, indicating that it takes nine product terms to implement the state
machine.Intuitively, the state assignment algorithms used by nova are much like the assignment guidelines of Section 9.3.2. States that are mapped by some input into the same next state and that assert the same output are partitioned into groups. In the terminology of state assignment, these are called input constraints. Nova attempts to assign adjacent encodings within the smallest Boolean cube to states in the same group. A related concept is output constraints. States that are next states of a common predecessor state are given adjacent assignments.
Nova implements a wide range of state-encoding strategies, any of which you can select when you invoke the program:
- Greedy: makes its state assignment based on satisfying as many of
the input constraints as it can. When using a greedy approach, nova
looks only at new assignments that strictly improve on those it has already
examined.
- Hybrid: also makes its state assignment based on satisfying the input
constraints. However, nova will examine some assignments that start
off looking worse, but may eventually yield a better assignment. This yields
better assignments than the greedy strategy.
- I/O Hybrid: similar to hybrid, except it tries to satisfy
input and output constraints. Its results are usually better than hybrid's.
- Exact: obtains the best encoding that satisfies the input constraints.
Since the output constraints are not considered, this is still not the best
possible encoding.
- Input Annealing: similar to hybrid, except it uses a more
sophisticated method for improving the state assignment.
- One-Hot: uses a one-hot encoding. This rarely yields a minimum
product term assignment, but it may dramatically reduce the complexity of
the product terms
(
in other words, the literal count may be greatly reduced)
.
- Random: uses a randomly generated encoding. Nova
will generate the specified number of random assignments. It will report
on the best assignment it has found
(
the one requiring the smallest number of product terms)
.
(
a)
and (
b)
.
.i
, .o
,
and .s
directives. For example, the traffic light input file
is shown in Figure 9.37.
- Random: chooses a random state encoding.
- Sequential: assigns the states binary codes in a sequential
order.
- One-Hot: makes a one-hot state assignment.
- Fan-in: works on the input and fan-in of each state. Next
states that are produced by the same inputs from similar sets of present
states are given adjacent state assignments. The assignment is chosen to
maximize the common subexpressions in the next-state function. This is much
like the high-priority assignment guideline of Section 9.3.2.
- Fan-out: works on the output and fan-out of each state, without regard to inputs. Present states with the same outputs that produce similar sets of next states are given adjacent state assignments. Again, this maximizes the common subexpressions in the output and next-state functions. This approach works well for finite state machines with many outputs but few inputs. It is much like the medium- and low-priority assignment guidelines from Section 9.3.2.
| HG | HY | FG | FY | Number of product terms |
---|---|---|---|---|---|
Random: | 01 | 10 | 11 | 00 | 9 |
Sequential: | 01 | 10 | 11 | 00 | 9 |
Fan-in: | 00 | 01 | 10 | 11 | 8 |
Fan-out: | 10 | 11 | 00 | 01 | 8 |
| S0 | S1 | S3' | S4' | Number of product terms |
---|---|---|---|---|---|
Random: | 01 | 10 | 11 | 00 | 5 |
Sequential: | 01 | 10 | 11 | 00 | 5 |
Fan-in: | 10 | 11 | 00 | 01 | 4 |
Fan-out: | 10 | 11 | 00 | 01 | 4 |
Jedi The final encoding program we shall examine is jedi. It is similar to mustang in that its goal is to obtain a good encoding for a multilevel implementation. It is more powerful than mustang because it can solve general encoding problems: jedi can find good encodings for the outputs as well as the states.
Like the other programs, jedi implements several alternative encoding strategies that can be selected on the command line. Besides random, one hot, and straightforward, the program supports input dom-i-nant, output dominant, modified output dominant, and input/output combination algorithms.
The jedi input format is similar to, but slightly different from, the mustang input format. We can illustrate this best with an example. Figure 9.38 shows the jedi input file for the traffic light controller. The present state and next states each count as a single input, even though they may be encoded by several bits. The
.enum States
line tells jedi that there are four states
and that they should be encoded in 2 bits and then gives the state names.
The .enum Colors
line tells jedi that there are three
output colors and that these should also be encoded in 2 bits. You should
think of these as enumerated types. The .itype
and .otype
lines define the types of the inputs and outputs, respectively.
| HG | HY | FG | FY | Grn | Yel | Red | Number of product terms |
---|---|---|---|---|---|---|---|---|
Input: | 00 | 10 | 11 | 01 | 11 | 01 | 00 | 9 |
Output: | 00 | 01 | 11 | 10 | 10 | 11 | 01 | 9 |
Combination: | 00 | 10 | 11 | 01 | 10 | 00 | 01 | 9 |
Output': | 01 | 00 | 10 | 11 | 10 | 00 | 01 | 10 |
| S0 | S1 | S3' | S4' | Number of product terms |
---|---|---|---|---|---|
Input: | 01 | 00 | 11 | 10 | 4 |
Output: | 11 | 01 | 00 | 10 | 4 |
Combination: | 10 | 00 | 11 | 01 | 4 |
Output': | 11 | 01 | 00 | 10 | 4 |
=
00, HY =
01, FG =
10,
FY =
11, Green =
00, Yellow =
01,
and Red =
10 and the jedi encoding in which HG =
00, HY =
01, FG =
11, FY =
10, Green
=
10, Yellow =
11, and Red =
01.
The first encoding used eight product terms in a two-level implementation;
the second used nine.
No comments:
Post a Comment