Sunday, 23 June 2013

Intelligent Solution Space Enumeration

As we saw in our examination of linear integer programming, the solution space for an optimization problem can be successively partitioned until each of the portions are bounded by optimal integer solutions. It is even possible to discontinue some of the search paths when it is known that a solution no better than that already found is forthcoming. This suggests a rather famous, yet quite similar method for solving NP-complete problems.
We wish to examine methods of solution space enumeration not based upon geometry, but upon the actual integer solutions themselves. Consider the chromatic number problem for the graph pictured below in figure 1.
Figure 1 - A Graph
It is rather obvious that it can be colored with three colors if nodes b and c are the same color. Let us examine all possible combinations of three colors that can be applied to the nodes of the graph. Since each node can be any of the three colors, there are exactly 34 = 81 colorings, but we can easily reduce this to 33 = 27 if we specify that node a is to be a particular color. Let us set our colors as blue, yellow, and white. We shall color node a blue. In figure 2 all of the coloring combinations are specified for three colors and the remaining graph vertices.
A cursory examination reveals that only a few of these are feasible solutions. Those in the tree on the left (with node b set to blue) are not feasible since node a was set to that color, node b is adjacent to it, and both cannot be the same color.
Figure 2 - Coloring Combinations
In the other trees, we need not consider any portion of the subtree with node c set to blue since node c is also adjacent to node a. The same is true when coloring node d.
Continuing on and deleting unfeasible solutions from the search space, we can prune the search tree until we arrive at the search tree of figure 3. Note that there are exactly two feasible solutions when node a is originally colored blue.
Figure 3 - Intelligent Feasible Solution Tree
Looking even closer at our search tree, it is certain that if we were to search the tree in a depth-first manner for a three-color solution we might only traverse the leftmost branches of the tree in figure 3. Describing the nodes of the tree as quadruples of colors (blue, yellow, and white); we would examine the sequence of partial solutions:
<b, ?, ?, ?>, <b, y, ?, ?>, <b, y, y, ?>, <b, y, y, w>
before finding a feasible solution. Thus by intelligent pruning of the feasible solution tree we may find a solution without looking at the entire set of solutions.
Slightly different reasoning could also be applied to the problem. At each level of the tree we might note that:
  • a node must not be the same color as any adjacent node that is already colored, and
  • a node need only be colored with one of the colors already assigned or the next color not assigned yet.
Looking back at the original search tree, we now know we need not examine combinations where node b is blue. This also cuts down the size of the feasible solution space that we must examine.
Several things should be noted at this point. We could have solved the optimization problem for the chromatic number problem in graphs by making sure that all portions of the search tree that contained one and two color solutions were considered. In addition, we made our decisions of what must be examined based upon partial solutions. This is often the case and we need not enumerate all of the full solutions in order to rule out many cases of optimization problems.
This is an example of a very famous algorithmic technique named branch and bound. Branching takes place since the solution space is modeled as a tree and a search of the tree is performed. Bounding takes place when a subtree is eliminated because it is infeasible or none of the solutions in the subtree are better than the best found so far.
Here, in figure 4, is the branch and bound algorithm for the chromatic number problem that we have just developed.
Figure 4 - Chromatic Number Solution Space Search
Let us now examine the knapsack problem, in particular, the 0-1 knapsack problem. For our example we have three items weighing:
31, 26, 15, and 7 pounds,
and wish to fill a knapsack that can contain 49 pounds. First, we note that this problem is very similar to those depicted in the material on enumeration of solution spaces for integer programming. A depth-first search was used successfully there, so we shall use one here as well.
Our strategy is somewhat greedy. We first examine all of the feasible solutions containing the 31 pound item, then those containing the 26 pound item but not the 31 pound one, and so forth. Figure 5 contains such a depth-first search tree for this 0-1 knapsack problem. Note that unfeasible solutions are blue and actual solutions are yellow.
Figure 5 - Depth-First Search Tree for 0-1 Knapsack Problem
This search tree also bears some similarity to the coloring problem in that we do not continue a path when there cannot be a feasible solution below a node. On the leftmost path through the search tree of figure 5 the capacity was exceeded, so no additional combinations containing the 31 and 26 pound items were examined.
An important fact about the search tree for the knapsack problem emerges at this point.
The knapsack weight at any node is a lower bound for the knapsack weights in that node’s subtree.
Thus, if the knapsack capacity has been exceeded at some node of the search tree, the subtree below that node need not be searched. Our knapsack limit is 49 pounds, so we cease our search at the blue node labeled ‘31+26’ as this sums to 57 pounds.
The search tree provides even more information than the lower bounds that reveal when the knapsack overflows or is about to overflow. We can also easily compute upper bounds on the knapsack weight in the subtree for each node based upon the sum of the possible weights that could be added to the load. Consider the tree in figure 6 that now has labels on some of its nodes.
Figure 6 - Search Tree with Upper Bounds
The numbers to the left of some of the nodes provide upper bounds on the knapsack loads in their subtrees. Thus there is a 79 at the root since that is the sum of all the weights and that, of course, is the largest load that one might attempt to place in the knapsack. At the node containing the 31 pound weight there is a 79 upper bound for the same reason. Examining the load below (with the 31 and 15 pound weights) we find that only the 7 pound weight can be added and so the maximum load for this subtree is 53. At the node containing the 26 pound weight, the 15 and 7 pound weights could be added to the knapsack, so the upper bound for this subtree is 48.
When we reach the node containing the weight of 15 pounds we find that the maximum knapsack load in its subtree is 22 pounds and realize that we need not search that subtree since we have already encountered a better solution, namely 26+15+7 = 48 pounds.
Thus we estimated the best possible solution for each subtree by adding up all of the weights that could be added to the knapsack, and if they did not provide a weight greater than that encountered so far in the search, the subtree was not visited.
We now have two rules for not visiting subtrees based upon the bounds that were computed at each node.
  • If the lower bound is greater than the knapsack capacity, then no feasible solution exists in the subtree.
  • If the upper bound is less than the best solution found thus far, then an optimum solution is not in the subtree.
By traversing the solution space tree, we are BRANCHING to new solutions and we compute a BOUND at each node that helps limit our search. For this reason, these algorithms have been called branch and bound algorithms.
Let us develop a branch and bound algorithm for the knapsack problem. Let the set W = {w1, ... , wn} be the weights of objects to be placed in the knapsack and let the set of variables X = {x1, ... , xn} indicate which objects are to be placed in the knapsack. (That is, if xi = 1, then the i-th object is in the knapsack.) This algorithm is described in figure 7.
Figure 7 - Depth-First Branch and Bound Knapsack Algorithm
Let us represent the knapsack content as the vector <x1, ... , x4> and perform a quick partial trace of the algorithm. We begin with nothing in the knapsack and set best = 0. Then we proceed to <1,0,0,0> and declare 31 to be the best so far. Continuing to <1,1,0,0> we find that we have exceeded the knapsack limit and backtrack. This brings us to <1,0,1,0> and 46 becomes our new best effort. After another overflow at the leaf <1,0,1,1>, we backtrack and find that the upper bound for <1,0,0,1> will not be better than 46, so we immediately backtrack all the way to the root. Next we set x1 = 0 and try the subtree rooted at <0,1,0,0>. This brings us a best load of 48. Insufficient upper bounds prevent examining any more nodes of the subtree.
If we omit the subtrees that the Pack procedure does not visit from our previous search trees, we find that the algorithm traverses the search tree shown in figure 8. As before, the nodes where the capacity was exceeded are darker and the subtree upper bounds (the sum + uk values from the algorithm) have been placed to the left of each node which is not a leaf.
Figure 8 - Search Tree for 0-1 Knapsack Algorithm
Let us turn our attention to another NP-complete problem, that of finding a minimum rectilinear Steiner spanning tree. Here is the formal definition of the problem.
Minimum Rectilinear Steiner Spanning Tree. Given a set of points in the plane, find a set of vertical and horizontal lines of minimal length that span the points.
Our example input data for this problem is the set of points on a unit grid pictured in Figure 9a. Figure 9b is an ordinary minimum spanning tree while figure 9c shows the special spanning tree made up of vertical and horizontal lines that is called a rectilinear Steiner spanning tree. If we use a rectilinear metric to measure the edges in the minimum spanning tree, its length is 13 while the Steiner version measures 12.
Figure 9 - Points, an MST, and Steiner Tree
Before finding the optimum rectilinear Steiner spanning tree (or Steiner tree in the sequel) for this problem, we mention a fact about minimum Steiner trees that will help us with our search.
There is a minimum Steiner tree containing only L-shaped edges between points that can be drawn so that exactly one line passes through each point.
The Steiner tree in figure 9c was constructed from the set of L-shaped edges {ad, bd, cd} by drawing one line through each point. Note that if we were to draw the same tree with a horizontal line passing through point a and one line through each point, the resulting tree would have measured 19. Note also that the tree with the set of edges {ac, bc, cd} is the same size as that of figure 9c.
Again we shall do a depth-first search, adding edges to the tree as we go in a manner reminiscent of Prim's algorithm for minimum spanning trees. To aid our endeavors, we order the edge list by length in hopes that a small tree will surface quickly. If the grid in figure 9 is made up of unit squares, note that the edges are of rectilinear length:
We initially examine the smallest edge (bc), and place its most central vertex (c) in the tree. At this point we do a depth-first search on the edges from c: bc, cd, and ac. Note that we have ordered them by length with the smallest first. The resulting search tree appears as figure 10.
Figure 10 - Depth-First Search Tree for Steiner Tree Problem
At the node on the left labeled bc the tree contains the set of vertices {b, c} and the edge bc. Below this we add the edges leading from b and c to the remaining nodes a and d, and we again visit them in order by length. At the node on the left labeled cd, the tree contains the vertices {b, c, d} and the edges {bc, cd}.
At each vertex of the search tree, we check to see if the spanning tree that is being build is larger than the best found so far. As an initial upper bound on the spanning tree size we use the size of the minimal spanning tree over the points.
Observing the grid that is induced by the points reveals that any rectilinear spanning tree must be at least as big and the grid width plus the grid length. In addition, a theorem by Hwang that states that the smallest rectilinear Steiner spanning tree is no smaller than two-thirds the size of the minimal spanning tree over the points. Thus we may use the largest of these as a lower bound on the size of the best solution and cut off the search if this is achieved.
The algorithms for this process are presented in figures 11 and 12.
Figure 11 - Steiner Tree Main Program
Figure 12 - Depth-First Search Algorithm for Steiner Trees
In our example from figure 9, half of the perimeter is 11 while the minimum spanning tree is 13. Two-thirds of this is 8.7 or 9. So, we begin with a best tree size of 13 and can cut off our search if we find one of size less than 11.