Examining the geometric
interpretation of integer programming reveals that a problem’s
constraints form a polytope. Optimum solutions for the relaxation of the
problem to linear programming can be found on the convex hull of this
polytope. But unfortunately, optimum integer solutions are often found
inside the polytope rather than on the convex hull. This is why linear
programming does not solve integer programming problems.
Consider the two dimensional optimization problem
shown in figure 1. The feasible solution space is the darker area and we
wish to maximize the sum of the two variables. Integer solutions to the
problem occur at the intersections of the grid edges.
Figure 1 - Local Search Path Through Neighborhoods
Suppose that we
were able somehow to find a feasible solution, perhaps through the use
of a greedy method. Suppose that the solution indicated by the dot
labeled a in figure 1 is such a solution. Let us search the area
directly around this solution (which we call the neighborhood of
the solution) to see if there is a better solution which is not too
different than the one we have. If we find a better solution, then we
continue the search for better solutions. This process is illustrated in
figure 1. The dots are feasible solutions and the large circles are
neighborhoods. As we find better and better solutions, we progress
upwards and to the right on the path:
a ® b ® c ® d ® e,
until an optimal solution (in this case found at point e) is encountered.
This method is entitled local search and calls
for searching a small area around a solution and adopting a better
solution if found. The process halts when no better solutions occur.
This algorithm is illustrated in figure 2.
Figure 2 - A Basic Local Search Algorithm
We shall begin with Closed City Tours as out first example. If t is a tour of n cities, then a 1-change neighborhood is defined as:
N1c(t) = {s | s = t with one city’s position changed}
After forming the
neighborhood, it is searched for a better tour. In figure 3, a city is
moved in going from (a) to (b), resulting in a better tour.
Figure 3 - A Local Change in a Closed Tour
An even better neighborhood definition for closed tours is a class called k-optimal
neighborhoods. These are the result of removing k edges from a tour and
reconnecting the tour. Local search methods using 3-optimal
neighborhoods have proven very effective.
One of the most famous local search algorithms is
Kernighan and Lin's min-cut algorithm for the graph partition problem.
The problem’s formal definition appears below.
Graph Partition. Given a weighted graph G = (V, W). Find disjoint sets of vertices A and B such that AÈ B = V and the sum of the weights of the edges between A and B is minimized.
Let us take a weighted graph G = (V, W) where W[i, j] provides the weight of the edge between vi and vj. In figure 4 we find an example with the vertex set V = {r, s, t, u, v, w}.
Figure 4 - Weighted Graph
First we
partition V into two subsets of the same size which we shall name A and
B. Then we call the sum of the weights of all of the edges between A and
B the cost of the partition. This is denoted Cost(A, B). In our
example, we partition the graph into A = {r, s, t} and B = {u, v, w}.
After adding up the weights of all the edges passing between vertices in
A and B we find that Cost(A, B) = 20.
As noted above, the Min-Cut algorithm strategy begins
with an arbitrary partition of the vertex set V into sets A and B. Then
we attempt to form better partitions by swapping vertices between A and
B until the cost seems to be the best that can be achieved.
To do this we must
examine neighborhoods formed by exchanging pairs of vertices from A and
B. If the partition P is the pair <A, B>, then for two vertices a Î A and b Î B the partition Pab is:
Pab = <A È {b} - {a}, B È {a} - {b}>.
It is formed by swapping the vertices a and b between
partitions. This makes the neighborhood of P = <A, B> the
collection of all such partitions. That is:
N(P) = { Pab | for all a Î A and b Î B}.
Now we need to formulate the change in cost of swapping induces.
Definition. The external cost or gain E(a) of moving vertex a out of the set A is the sum of the weights of all the edges leaving A from a.
Definition. The internal cost or loss I(a) of moving vertex a out of the set A is the sum of the weights of the edges between a and other vertices in A.
Definition. The total cost or difference incurred by moving vertex a out of the set A is: D(a) = E(a) - I(a).
An easy technical lemma follows directly from these definitions.
Lemma. The gain incurred from swapping vertices a Î A and b Î B is: g(a, b) = D(a) + D(b) - 2*W[a, b].
Let’s look at the values of these items using our last graph and the partition of A = {r, s, t} and B = {u, v, w}.
Now that we know the
external and internal costs incurred by swapping vertices across the
partition, here are the costs when pairs are swapped.
From these numbers we can
conclude that swapping r and u seems to be a good idea. So is swapping t
and w. We should probably not wish to swap s and v however. Or at least
not at this time.
Let us apply the local search algorithm. We take the
graph G = (V, W) and divide the vertex set V arbitrarily into sets A =
{r, s, t} and B = {u, v, w}. Then we examine the nine neighborhoods that
result from exchanging pairs of vertices from A and B. Figure 5a
contains difference values for all of the vertices in the table on the
left while the table on the right indicates all of the gains involved in
swapping vertices between A and B.
a) Swap r and u, total gain = 6
b) Best swap: s and v, total gain = 6
Figure 5 - Tables used during Min-Cut Local Search
Swapping either
r and u or t and w both provide a gain of 6. We elect to exchange r and
u. Then we recalculate the differences and gains for all vertices.
These new values appear in figure 5b. At this point it seems that the
only swap which is not destructive involves s and v. The local search
terminates at this point because there is no gain.
It is intriguing to wonder if swapping s and v leads
to a neighborhood where better partitions reside. As we see in figure 6,
this is not to be.
Best new swap: t and r, total gain = -1
Figure 6 - More Min-Cut Local Search Tables
The idea is
worth following and Kernighan and Lin did so. They argued that
partitions such as that above are often local minima and continuing to
swap vertices, even though there is negative gain might lead to
neighborhoods with better partitions. Consider the graph in figure 7
that illustrates how total gain might possibly change over time.
Figure 7 - Total Gain over Time
After swaps
one, three, and eight, local maxima occur in the total gain function.
The local search algorithm would halt at any of these. What we would
like to do is continue to swap and possibly reach the global maximum at
step eight. Kernighan and Lin’s variable depth method of local search provides this capability.
We shall now examine sequences of swaps. The process
begins as before by searching the neighborhood of the partition for the
best pair of vertices to exchange. These are swapped even if this
reduces the total gain. In our example the sequence looks like this:
To prevent cycling, we
only allow swapping a vertex once in a sequence of exchanges. Thus,
after r and u are swapped in the above sequence, they become unavailable
for exchanging. Vertex swapping is continued until A and B have been
totally interchanged. We then retain all swaps up to the point of
maximum gain in the sequence. In our example, this means retaining only
the first swap. At this point we begin again with another sequence of
swaps and continue swapping sequences until no gain takes place. The
second swapping sequence is:
Since no gain took place,
we halt and present {u, s, t} and {r, v, w} as our partition. Figure 8
provides a description of the entire algorithm.
Figure 8 - Graph Partitioning Algorithm
Several
implementation features are of value since they make the algorithm run a
bit faster. For example, the recalculation of D[u] for all u Î V that are still available just requires a small update, not an entire one. If ai and bk have been swapped and u Î A then:
D[u] := D[u] + 2* (W[u, ai] - W[u, bk])
Finding the pair to
swap is much quicker also if the values of D[a] and D[b] have been
sorted. Also, we need not do n/2 swaps since the last one merely
completes the entire interchange of A and B and changes the total gain
to zero.
Now let us turn our attention to the analysis of this
algorithm. Three things must be examined: correctness, complexity, and
possibility of reaching an optimum solution. We shall take the latter
first.
Theorem 1. Using the basic operations of the MinCut algorithm it is possible to reach an optimal solution.
Proof.
The MinCut algorithm swaps vertices. Thus we need to show that by
swapping vertices it is possible to go from any partition to an optimal
one, that is, one with the minimum cut.
Suppose we are given an arbitrary partition P and an optimal one POPT. A brief examination of P reveals that it either is the same as POPT
or it is not. If not, then there is a vertex in each half of P that
does not belong. Swap them and continue until the desired partition
appears.
For this problem, correctness is rather simple. As
there are no guarantees, we only need show that a feasible solution has
been found. Thus we may state the following with very little proof.
Theorem 2. The MinCut algorithm is correct.
Proof.
We begin with a feasible solution. Swapping two vertices provides
another feasible solution. Thus at every stage of the algorithm we have a
feasible solution.
With approximation algorithms, correctness is not as
much of an issue as with optimum algorithms. However, the complexity of
the procedure is of great importance to us and must be examined.
Computing the differences D[u] initially is an O(n2) operation since all of the graph edges need to be examined. Setting the vertices as available requires only linear time.
Inside the for loop, selection of the pair to be swapped can be O(n2). Recalculating each D[u] takes a constant amount of time for each available u, so this too is O(n).
Empirical results indicate that the repeat loop will be repeated less
than four times, even for very large values of n. These results indicate
a time complexity of O(n3) for the algorithm.
No comments:
Post a Comment