Sunday 23 June 2013

Bounds for Heuristics


Whenever we opt for a quick algorithm that finds an approximate solution to a problem, we hope the solution will be as close to optimum as possible. Better still is a guarantee that the solution lies within a certain distance of the optimum. That is, given an instance of a problem, we would like to bound how far the objective function value of the approximate solution can stray from that of the optimum solution.
Thus, we would like to find a relationship between the algorithm and the optimum solution. For an algorithm A and input (or instance) I, we shall call the value of the objective function for the solution provided by the algorithm A(I). Let us denote the value of the objective function for the optimum solution OPT(I). For minimization problems, we wish to find a function g such that for all instances of the problem:

A(I) ≤ g(OPT(I)).
Consider bin packing. Our input is n items {u1, u2, …, un} of size s(ui) between 0 and 1. We wish to minimize the number of bins needed to pack the items with the constraint that each bin is of size 1. We shall use an algorithm named first fit as our first example. It is a very simple greedy algorithm that works as follows. First, line up the bins in a row from left to right. Then we merely go through our collection of items, placing each in the leftmost bin that has room for it. Figure 1 illustrates this for the collection of items with sizes

{1/3, 3/4, 1/4, 2/3, 3/8, 1/4}.
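A minimal sketch of first fit in Python. Exact fractions are used so that sizes such as 3/4 + 1/4 fill a bin exactly, with no floating-point fuzz:

```python
from fractions import Fraction as F

def first_fit(items, capacity=1):
    """Place each item in the leftmost bin with room; open a new bin if none fits."""
    bins = []                       # each bin is a list of item sizes
    for size in items:
        for b in bins:
            if sum(b) + size <= capacity:
                b.append(size)
                break
        else:                       # no existing bin had room
            bins.append([size])
    return bins

items = [F(1, 3), F(3, 4), F(1, 4), F(2, 3), F(3, 8), F(1, 4)]
bins = first_fit(items)
print(len(bins))                    # 3 bins, as in figure 1
```

Running it on the sizes above packs {1/3, 1/4, 3/8} in the first bin, {3/4, 1/4} in the second, and {2/3} alone in the third.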


Figure 1 - First Fit Bin Packing
This required three bins. A moment's thought tells us that we will need at least one bin and no more than n bins. If FF(I) is the objective value for the first fit algorithm on I, then:

1 ≤ FF(I) ≤ n
But we can refine these bounds a little more. Suppose the items were liquid and we could pour them into the bins. Then we would have to have enough bins to take care of the sum of the sizes. That is our new lower bound. In addition, we should note the following:
Fact. No more than one bin can be half full or less, and that one must be the rightmost in the row.
Proof. Suppose two bins were each half full or less. To place items in the rightmost of the two, we would have had to pass over the other, even though the item would have fit there. This goes against the rules of the first fit algorithm.
This fact provides the upper bound for the number of bins: the number we would have if we poured each a tiny bit more than half full. Thus:

⌈Σ s(ui)⌉ ≤ FF(I) ≤ ⌈2·Σ s(ui)⌉
The lower bound in the above equation is the least that our optimal solution can possibly be. Putting this all together we derive that

FF(I) ≤ 2·OPT(I).
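The bound can be exercised numerically. The sketch below generates random instances and checks FF(I) against twice the "liquid" lower bound ⌈Σ s(ui)⌉; since ⌈Σ s(ui)⌉ ≤ OPT(I), this test is at least as strict as checking FF(I) ≤ 2·OPT(I):

```python
import math
import random

def first_fit(items, capacity=1.0):
    """Count the bins used by first fit (float sizes)."""
    bins = []
    for size in items:
        for b in bins:
            if sum(b) + size <= capacity:
                b.append(size)
                break
        else:
            bins.append([size])
    return len(bins)

random.seed(1)                              # deterministic sketch
for _ in range(200):
    items = [random.uniform(0.01, 1.0) for _ in range(random.randint(1, 30))]
    liquid = math.ceil(sum(items))          # "liquid" lower bound, <= OPT(I)
    assert first_fit(items) <= 2 * liquid
print("bound held on all 200 random instances")
```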
A rather exotic and long analysis of first fit does provide a tighter bound. In fact, it can be shown that:

FF(I) ≤ (17/10)·OPT(I) + 2.
For our next example, let us turn to the Closed Tour problem. It possesses a rather famous algorithm with a provable optimality bound.
First, one constructs the minimum spanning tree (MST) for the collection of cities. Then all of the edges in the minimum spanning tree are duplicated. An example appears as the left portion of figure 2.


Figure 2 - Double MST and Extracted Tour
At this point every vertex has an even number of edges leading from it, so a tour which traverses all of the edges such as:
a - b - e - c - d - c - e - h - e - g - f - g - e - b - a
is possible and can be easily generated. This well-known type of tour is called an Euler tour.
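Generating such a tour can be sketched with Hierholzer's algorithm for Euler tours. The edge list below is my reading of the doubled minimum spanning tree in figure 2, inferred from the tour printed above:

```python
from collections import defaultdict

def euler_tour(edges, start):
    """Hierholzer's algorithm on a multigraph given as a list of edges."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    stack, tour = [start], []
    while stack:
        u = stack[-1]
        if adj[u]:                  # follow an unused edge out of u
            v = adj[u].pop()
            adj[v].remove(u)        # consume one copy of the reverse edge
            stack.append(v)
        else:                       # dead end: u is finished, emit it
            tour.append(stack.pop())
    return tour[::-1]

# MST edges read off the tour a-b-e-c-d-c-e-h-e-g-f-g-e-b-a; doubling them
# gives every vertex even degree, so an Euler tour exists.
mst = [('a','b'), ('b','e'), ('e','c'), ('c','d'), ('e','h'), ('e','g'), ('g','f')]
tour = euler_tour(mst * 2, 'a')
print(len(tour))                    # 15 vertices: 14 doubled edges, closed at 'a'
```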
Since an optimal closed tour visits every city and is connected, it can be no shorter than the minimum spanning tree. The Euler tour has length exactly twice the minimum spanning tree, so it is no worse than twice the optimum closed tour that visits all of the cities once.
At this point we merely extract a closed tour such as that on the right in figure 2 from the Euler tour, skipping over any city we have already visited. By the triangle inequality these shortcuts never increase the length, so we are assured that:

OPT ≤ tour ≤ 2·OPT.
The complete algorithm appears as figure 3 below.


Figure 3 - Closed Tour Algorithm
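The whole pipeline can be sketched in a few lines; the shortcutting step is implemented here as a preorder walk of the MST, which visits the cities in the same order as skipping repeats in the Euler tour. The point set is a made-up example, and distances are Euclidean so the triangle inequality holds:

```python
import math
from collections import defaultdict

def tour_2_approx(points):
    """Double-MST 2-approximation for a closed tour over 2-D points."""
    n = len(points)
    def dist(i, j):
        return math.hypot(points[i][0] - points[j][0],
                          points[i][1] - points[j][1])
    # Prim's algorithm for the minimum spanning tree
    in_tree = [False] * n
    best = [math.inf] * n
    parent = [None] * n
    best[0] = 0.0
    adj = defaultdict(list)
    mst_weight = 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        mst_weight += best[u]
        if parent[u] is not None:
            adj[parent[u]].append(u)
            adj[u].append(parent[u])
        for v in range(n):
            if not in_tree[v] and dist(u, v) < best[v]:
                best[v] = dist(u, v)
                parent[v] = u
    # Preorder walk of the MST = Euler tour of doubled edges with repeats skipped
    tour, seen, stack = [], set(), [0]
    while stack:
        u = stack.pop()
        if u not in seen:
            seen.add(u)
            tour.append(u)
            stack.extend(adj[u])
    length = sum(dist(tour[i], tour[(i + 1) % n]) for i in range(n))
    return tour, length, mst_weight

points = [(0, 0), (0, 2), (2, 2), (2, 0), (1, 1)]
tour, length, w = tour_2_approx(points)
assert length <= 2 * w              # tour <= 2*MST <= 2*OPT
```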
An even better method is to add just enough edges to the spanning tree so that each vertex has even degree; this is Christofides' algorithm, in which the added edges form a minimum-weight matching on the odd-degree vertices. Then, following all of the steps in the above algorithm provides a tour with the following bounds:

OPT ≤ tour ≤ (3/2)·OPT.
Another bounding result that depends upon minimum spanning trees, due to Hwang, concerns rectilinear Steiner trees.
Theorem (Hwang). The shortest rectilinear Steiner spanning tree over a set of points is no less than two-thirds the size of the rectilinear minimal spanning tree over the points.
Thus if one can show that an algorithm produces a Steiner tree no larger than the minimal spanning tree, then that tree is no worse than one and a half times the optimum, since by Hwang's theorem the minimal spanning tree is at most (3/2)·OPT.
