Sunday 23 June 2013

Linear Programming


The intersection of integer programming and linear programming seems the logical place to begin our discussion of integer programming. We know that these are exactly the convex problems which can be stated in integer programming terms. Thus if we find a minimum (or maximum) solution we know that it is a global minimum (or maximum). And, since these problems lie within the realm of linear programming, we are guaranteed an optimal solution in polynomial time. Not only can these problems be solved in polynomial time, they comprise a major portion of the algorithms encountered in computer science. Even though few of them are ordinarily solved in a linear programming context, all of them can be stated in linear programming terms.
For example, consider bipartite graph matching. An example of a bipartite graph (one whose vertices can be partitioned into two sets with no edges inside either set) appears in figure 1b. The bipartite graph matching problem is to find a set of edges, no two of which share a vertex, which covers as many of the vertices as possible. If we select the set of edges:
{<a, b>, <c, f>, <e, d>}
then we have covered all of the vertices of the graph. This is a maximum matching for the graph.

Now we shall state the problem in linear programming terms. For each edge <u, v> of the graph we introduce the variable xuv. If the variable is set to 1 then the edge is part of the matching, and if set to 0, the edge is not in the matching set.
First, we would like to cover as many vertices as possible. This means including as many edges as we are able. We can accomplish this by maximizing the objective function:

z = xab + xad + xcd + xcf + xed.
Next we must make sure that no two edges sharing a vertex occur in the matching. Since two edges leave vertex a, we add the constraint:

xab + xad ≤ 1
in order to ensure that only one of the two edges ends up in the matching. We add similar constraints for vertices c and d. The complete linear program appears as figure 1a.

Figure 1 - Bipartite Graph Matching
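As a quick sanity check, here is a minimal Python sketch of this matching program (assuming SciPy is available; the variable order follows the objective function above). Since linprog minimizes, the objective is negated.

    # Variables, in order: xab, xad, xcd, xcf, xed.
    from scipy.optimize import linprog

    c = [-1, -1, -1, -1, -1]          # maximize xab + xad + xcd + xcf + xed
    A_ub = [[1, 1, 0, 0, 0],          # vertex a: xab + xad <= 1
            [0, 0, 1, 1, 0],          # vertex c: xcd + xcf <= 1
            [0, 1, 1, 0, 1]]          # vertex d: xad + xcd + xed <= 1
    b_ub = [1, 1, 1]

    result = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, 1)] * 5)
    print(result.x)   # [1. 0. 0. 1. 1.]: the matching {<a,b>, <c,f>, <e,d>}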

Let us now back up a step and introduce linear programming in a new fashion: as a geometric entity in the plane. Consider the line 2x1 + x2 = 6, which divides the plane into two halfplanes. If we wish to designate the halfplane above the line we write 2x1 + x2 ≥ 6. This halfplane is shown as the shaded region of figure 2a. Now add the line -2x1 + x2 = -2 to the first. The area below it is represented by -2x1 + x2 ≤ -2. The area bounded by these two lines forms the shaded region in figure 2b. Note that it is unbounded to the right.

Figure 2 - Areas Bounded by Lines

Adding x1 + 3x2 = 15 and -x1 + x2 = -3 to the first two lines provides a closed perimeter in the plane. This enclosed region is the shaded area pictured in figure 2c. The shaded region is defined by stating that the area lies above the first line, below the second and third, and above the last. The four inequalities (one for each line) below specify exactly which pairs <x1, x2> lie within the region.
2x1 + x2 ≥ 6
-2x1 + x2 ≤ -2
x1 + 3x2 ≤ 15
-x1 + x2 ≥ -3
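To make the membership condition concrete, here is a minimal Python sketch (the function name in_region is ours) which tests a pair against all four inequalities:

    def in_region(x1, x2):
        return (2*x1 + x2 >= 6 and      # above the first line
                -2*x1 + x2 <= -2 and    # below the second
                x1 + 3*x2 <= 15 and     # below the third
                -x1 + x2 >= -3)         # above the last

    print(in_region(3, 2))   # True: inside the shaded area
    print(in_region(0, 0))   # False: the origin lies outside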
Another area of the plane defined by lines is pictured in figure 3. This area is defined by the constraining equations provided in the chart at the left along with the edges they define.


Figure 3 - Constraints and Feasible Solution Region

As before, the area bounded by the lines (and axes) is precisely the region of pairs <x1, x2> which satisfy the constraining equations. This is called the feasible solution region. Finding the pair in this region which maximizes x1 + x2 is the optimization problem:
maximize z = x1 + x2
subject to the conditions:
-x1 + x2 ≤ 3
2x1 + 3x2 ≤ 19
-3x1 + x2 ≥ -12
where the values of x1, x2 ≥ 0.

A quick glance at the picture in figure 3 tells us that the best solution to the above optimization problem occurs at vertex d, where x1 takes the value 5 and x2 is 3.
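We can confirm this with a short sketch (again assuming SciPy is available). The third constraint, -3x1 + x2 ≥ -12, is negated to 3x1 - x2 ≤ 12 so that every row reads as ≤:

    from scipy.optimize import linprog

    c = [-1, -1]                      # maximize x1 + x2  ==  minimize -x1 - x2
    A_ub = [[-1, 1],                  #  -x1 +  x2 <=  3
            [ 2, 3],                  #  2x1 + 3x2 <= 19
            [ 3, -1]]                 #  3x1 -  x2 <= 12
    b_ub = [3, 19, 12]

    result = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 2)
    print(result.x)                   # [5. 3.] -- vertex d, as the picture shows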
Now we are ready for two definitions which formally describe this particular class of optimization problems.

Definition. A general linear programming problem may be stated as follows: Given real numbers b1 , b2 , ... , bm , c1 , c2 , ... , cn and aij (for i = 1, ... , m and j = 1, ... , n), minimize (or maximize) the objective function:
z(x1 , x2 , ... , xn) = c1 x1 + c2 x2 + ... + cnxn
subject to the conditions (one for each i = 1, 2, ..., m):
ai1x1 + ai2x2 + ... + ainxn ≤ bi (or = bi, or ≥ bi).

Definition. In a linear programming problem with nonnegativity constraints all of the variables xi are greater than or equal to zero.

If we could state the optimization as m equations in n unknowns, then maybe we could solve for some of the unknowns by methods such as Gaussian elimination from linear algebra. We now take our problem from figure 3 and rewrite the equations so that they are still valid, but now contain the equals operator. Here is the transformation:
-x1 + x2 ≤ 3  ⇒  -x1 + x2 + y1 = 3
2x1 + 3x2 ≤ 19  ⇒  2x1 + 3x2 + y2 = 19
-3x1 + x2 ≥ -12  ⇒  -3x1 + x2 - y3 = -12
Note exactly what was done. In the first equation we added y1 to bring the value of the left hand side up to 3. Thus y1 takes up the slack and is called a slack variable. In the last equation we subtracted the surplus variable y3 so that the value (which was at least -12) could be brought down to equal -12. This problem is now almost in the form we wish.
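The column-appending step is easy to mechanize. A minimal sketch (the helper name add_slack_surplus is ours): append one slack (+1) or surplus (-1) column per constraint.

    import numpy as np

    def add_slack_surplus(A, senses):
        """senses[i] is '<=' (add slack) or '>=' (subtract surplus)."""
        m, n = A.shape
        extra = np.zeros((m, m))
        for i, sense in enumerate(senses):
            extra[i, i] = 1.0 if sense == "<=" else -1.0
        return np.hstack([A, extra])

    A = np.array([[-1, 1], [2, 3], [-3, 1]])
    print(add_slack_surplus(A, ["<=", "<=", ">="]))
    # columns: x1  x2  y1  y2  y3 -- matching the transformation above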

Definition. A linear programming problem is in standard form if and only if all of the xi and bj are greater than or equal to zero and the constraints can be stated: Ax = b.

By changing all of the signs in our third equation, we finally arrive at the related standard problem for our previous problem.

maximize z(x1 , x2 , y1 , y2 , y3) = x1 + x2
subject to the constraints:
-x1 + x2 + y1 = 3
2x1 + 3x2 + y2 = 19
3x1 - x2 + y3 = 12
where the values of x1, x2, y1, y2, y3 ≥ 0.
One very important piece of information concerning the relationship between problems and their related standard problems needs to be stressed. Namely:

There is a one-to-one correspondence between a problem's feasible solutions and those for its related standard problem.

In our case, if we omit the slack and surplus variable (yi) values from any feasible solution to the related standard problem, we have a solution to our original problem.
Now that we have a set of three equations in five unknowns, let's try to solve for x1 and x2 by standard linear algebra methods. We do this in a series of tableaux. Our tableau setup is designed not only to help solve the problem, but to impart information as we progress. Compare the tableau below to the equations above:
    x1     x2     y1     y2     y3  |  bj
     1      1      0      0      0  |   0    -z
    -1      1      1      0      0  |   3    y1
     2      3      0      1      0  |  19    y2
     3     -1      0      0      1  |  12    y3
The columns are assigned to the original, slack, and surplus variables. The first row holds the objective function and, for each equation, there is a row holding its coefficients (the aij and the bj).
Now for some linear algebra. If we can find m independent columns (one for each constraint or row), then we have a basic solution to our problem. This basic solution comes from expressing the vector b as a linear combination of these columns (which we call a basis). At the moment our problem can be expressed:
Ax + Iy = b

(where I is an m by m identity matrix) and a basic feasible solution for our problem can be found by setting the xi = 0 and yj = bj. This is:
    x1     x2     y1     y2     y3
     0      0      3     19     12
Consult the picture in figure 3 and note that <0, 0> was indeed a feasible solution of the original problem. Now look again at the tableau above and note that the basis variables are indicated on the right.
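In code, a tableau is just a matrix, and reading off the basic feasible solution is mechanical: each basis variable takes the b value of its row and every other variable is zero. A sketch (the array layout and names are ours), reusing numpy from above:

    # The first tableau: objective (-z) row first, b in the last column.
    T = np.array([[ 1,  1, 0, 0, 0,  0],     # -z
                  [-1,  1, 1, 0, 0,  3],     # y1
                  [ 2,  3, 0, 1, 0, 19],     # y2
                  [ 3, -1, 0, 0, 1, 12]],    # y3
                 dtype=float)

    def basic_solution(T, basis, names):
        """basis[i] names the variable attached to constraint row i+1."""
        values = dict.fromkeys(names, 0.0)
        for i, var in enumerate(basis):
            values[var] = T[i + 1, -1]       # row 0 is the objective row
        return values

    print(basic_solution(T, ["y1", "y2", "y3"], ["x1", "x2", "y1", "y2", "y3"]))
    # {'x1': 0.0, 'x2': 0.0, 'y1': 3.0, 'y2': 19.0, 'y3': 12.0}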

At this point we would like to move along toward the optimum solution for our problem. Making column one into a unit vector (a single 1 and the rest 0's) would mean that we could express b as a linear combination of the columns for x1 and two of the yj. This is a step forward.
Look in column one. We would like to set the column one entry of one of the rows to 1 and then pivot on it. (By pivoting we mean adding or subtracting multiples of that row to or from the rest to get 0's elsewhere in the first column.) We cannot do that in the first row without making the value of b1 negative (which is not allowed in our definition of a standard problem). Using row two would set b2, and hence x1, to 19/2. This is legal, but 19/2 is not the value of x1 in any feasible solution. So, we are forced to use row three as our pivot.
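The pivot itself is a couple of lines of row arithmetic. Continuing the sketch, with T as above:

    def pivot(T, row, col):
        """Scale the pivot row so entry (row, col) becomes 1, then subtract
        multiples of it from the other rows to zero the rest of the column."""
        T = T.copy()
        T[row] /= T[row, col]
        for r in range(T.shape[0]):
            if r != row:
                T[r] -= T[r, col] * T[row]
        return T

    T2 = pivot(T, 3, 0)   # column one, row three (0-indexed: column 0, row 3)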
After pivoting on column one, row three, we produce the tableau:

    x1     x2     y1     y2     y3  |  bj
     0    4/3      0      0   -1/3  |  -4    -z
     0    2/3      1      0    1/3  |   7    y1
     0   11/3      0      1   -2/3  |  11    y2
     1   -1/3      0      0    1/3  |   4    x1
By doing this, we have added x1 to the basis and removed y3 from it. The feasible solution is now:
    x1     x2     y1     y2     y3
     4      0      7     11      0
which corresponds to the lower right corner of the polygon in figure 3. In the same manner, we select column two to pivot on next (so that x2 joins the basis). There is a choice between rows one and two. We select row two and produce the tableau:
    x1     x2     y1     y2     y3  |  bj
     0      0      0  -4/11  -1/11  |  -8    -z
     0      0      1  -2/11   5/11  |   5    y1
     0      1      0   3/11  -2/11  |   3    x2
     1      0      0   1/11   3/11  |   5    x1

which provides us with the solution:

    x1     x2     y1     y2     y3
     5      3      5      0      0

Here we halt with the optimum (maximum) feasible solution <5, 3> to our original linear programming problem.
(A note in passing on a topic to which we shall return later. As we pivoted and modified our tableaux until we found an optimal solution, we were traversing the vertices of the polygon in figure 3. In fact, we began at vertex a and went through vertex e so that we could end at vertex d. The geometric interpretation of this method of solving linear programming problems will be examined in a little more detail at the end of this section.)

Several topics need some explanation. The first one concerns exactly how we choose pivots. The top row of the tableau helps with this since it indicates how much the objective function would change if we added that column to the basis. For example, in the first tableau of the above example we note that if either x1 or x2 is added to the basis, then the objective function would increase by the new value of x1 or x2. We added x1 = 4 to the basis and the objective function went from 0 to 4. (Note that there is a -4 at the top right.) In the second tableau, the top row informs us that placing x2 in the basis will increase the objective function by 4/3 of x2's new value. This happened: 4/3 × 3 = 4, and the objective went from 4 to 8 as we moved from <4, 0> to the solution <5, 3>.
Thus selecting a pivot which will increase the value of the objective function is preferable. The way to do this, however, is controversial. A method named steepest descent calls for pivoting on the column with the largest positive entry in the top row. Another method (this one called greatest increment) tells us to pivot on the column which will increase the objective function by the greatest amount. The first is of course the easier in terms of computational time, but the second might just get us our optimal answer sooner. Both methods have devotees.

Now that we have selected a column, what row do we pick? It is easy to find out which rows not to pick. First of all, do not select a row that has a nonpositive entry in the selected column. Since all bj must remain positive, pivoting on a negative entry would ruin this. (If none of the entries in the column are positive, then there is a problem, since the top row entry tells us that the solution can be improved while the other entries claim that it cannot. This inconsistency means that the solution is unbounded and thus no optimum exists.) Next, do not select a row which will lead to an infeasible solution. An example here is column 1, row 2 of the first tableau in our last example. Pivoting on it would set x1 to 19/2, which is not feasible. Another problem occurs when pivoting on some row would make one of the bj entries negative. (This would happen to b3 if we pivoted on row 2, column 1 in the first tableau.) This also is forbidden in our standard form problem.
There is one other subtlety in pivoting. It is possible to cycle if one is unlucky. There are methods for keeping this from taking place, but we shall not investigate them here.
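The selection rules just described fit in a few lines (continuing the numpy sketch; choose_pivot is our name, and an anti-cycling rule such as Bland's is deliberately omitted). The minimum ratio test is what keeps every bj nonnegative:

    def choose_pivot(T):
        """Column by the steepest descent rule (largest positive top-row
        entry), row by the minimum ratio test.  Returns None when no column
        can improve the objective; raises if the problem is unbounded."""
        top = T[0, :-1]
        if top.max() <= 0:
            return None
        col = int(np.argmax(top))
        rows = [r for r in range(1, T.shape[0]) if T[r, col] > 0]
        if not rows:
            raise ValueError("unbounded: column improves but no row can pivot")
        row = min(rows, key=lambda r: T[r, -1] / T[r, col])
        return row, col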

Our last problem concerns the place we start when solving the equations. If we are fortunate and have a problem in the standard form Ax + Iy = b, then we merely set each xi = 0 and yj = bj and take that as our first basic feasible solution. Otherwise we have to do some extra work: we must find a basic feasible solution before we can begin to solve our optimization problem.
Consider our very first example, that of figure 2. After putting it in standard form with slack and surplus variables we have:
maximize z(x1 , x2) = 2x1 + x2
subject to the constraints:
2x1 + x2 - y1 = 6
2x1 - x2 - y2 = 2
x1 + 3x2 + y3 = 15
x1 - x2 + y4 = 3
where all xi, yj ≥ 0.

This looks good. But, what do we use as our first basic feasible solution? Looking merely at the equations, there are values for the yj which satisfy them. In fact, something like:
    x1     x2     y1     y2     y3     y4
     0      0     -6     -2     15      3
ought to be a solution to the problem. But, this of course is not a basic feasible solution because the yj are negative and this is not allowed in a standard linear programming problem.
We are in trouble. There seems to be no obvious basic feasible solution. Looking at the picture of this problem in figure 2 tells us what happened. Since <0, 0> was not in the feasible solution space, we could not begin there. So, how do we begin with no proper starting tableau?

Well, we just might be able to transform this into another linear programming problem. For example, if we added some extra variables to the first two equations which had the proper signs, then the yi could be set to zero and there would be a basic feasible solution to the new problem. Consider the related problem:
minimize z(s1, s2) = s1 + s2
subject to the constraints:
2x1 + x2 - y1 + s1 = 6
2x1 - x2 - y2 + s2 = 2
x1 + 3x2 + y3 = 15
x1 - x2 + y4 = 3
where all xi, yj, sj ≥ 0

where the variables s1 and s2 have been added. The basic feasible solution to this new problem is:
    x1     x2     y1     y2     y3     y4     s1     s2
     0      0      0      0     15      3      6      2
Minimizing s1 + s2 means bringing them down to zero. If we are lucky, we can get both of the sj to zero and toss them out of the equations. This will leave a basic feasible solution to our related standard problem.
Since minimization is just reversing signs and maximizing, we begin with a tableau like that below with -1's on the top row over the sj for our new objective function.
    x1     x2     y1     y2     y3     y4     s1     s2  |  bj
     0      0      0      0      0      0     -1     -1  |   0    -z
     2      1     -1      0      0      0      1      0  |   6    s1
     2     -1      0     -1      0      0      0      1  |   2    s2
     1      3      0      0      1      0      0      0  |  15    y3
     1     -1      0      0      0      1      0      0  |   3    y4
The tableau is not in proper form though. We want to have zeros, not negative numbers, in the top row over the basis columns. To do this we merely add the first two rows (those in which s1 and s2 appear) to the top row. This sets -z = 8 as our beginning point and we now have:
    x1     x2     y1     y2     y3     y4     s1     s2  |  bj
     4      0     -1     -1      0      0      0      0  |   8    -z
     2      1     -1      0      0      0      1      0  |   6    s1
     2     -1      0     -1      0      0      0      1  |   2    s2
     1      3      0      0      1      0      0      0  |  15    y3
     1     -1      0      0      0      1      0      0  |   3    y4
We wish to get rid of the si's, so the first step is to pivot on the first column (which looks very promising since there is a 4 in the top row) and the second row, to add x1 to the basis and get:
    x1     x2     y1     y2     y3     y4     s1     s2  |  bj
     0      2     -1      1      0      0      0     -2  |   4    -z
     0      2     -1      1      0      0      1     -1  |   4    s1
     1   -1/2      0   -1/2      0      0      0    1/2  |   1    x1
     0    7/2      0    1/2      1      0      0   -1/2  |  14    y3
     0   -1/2      0    1/2      0      1      0   -1/2  |   2    y4

The second column now looks very attractive and so we select it for our next pivot. After pivoting on the second column, first row we have:

    x1     x2     y1     y2     y3     y4     s1     s2  |  bj
     0      0      0      0      0      0     -1     -1  |   0    -z
     0      1   -1/2    1/2      0      0    1/2   -1/2  |   2    x2
     1      0   -1/4   -1/4      0      0    1/4    1/4  |   2    x1
     0      0    7/4   -5/4      1      0   -7/4    5/4  |   7    y3
     0      0   -1/4    3/4      0      1    1/4   -3/4  |   3    y4
At last the sj are zero and we know that we have a basic feasible solution to our related standard problem. It is:
    x1     x2     y1     y2     y3     y4
     2      2      0      0      7      3
and the xi pair <2, 2> is indeed a feasible solution to the original.
Thus linear programming itself provides the method which is used to discover the basic feasible solution needed in order to start solving the related standard problem. It also informs us as to whether or not there are feasible solutions. The algorithm we went through above is named the simplex method and is the standard method for solving linear programming problems.
With the simplex method in hand we can either solve linear programming problems, or detect situations where optimal solutions cannot be found. The procedure is outlined in figure 4.
place the problem in standard form
if there is no basis then                                  [PHASE I]
    add an artificial basis of sj variables
    solve problem to minimize the sum of the sj
    if unsuccessful then no solution exists
    otherwise discard the sj variables and restore the original objective function
solve problem                                              [PHASE II]
if unable to pivot then problem is unbounded

Figure 4 - Two Phase Linear Programming Solution
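Phase II of this outline is just a loop over the two helpers sketched earlier (pivot and choose_pivot, our names); Phase I runs the same loop on the artificial objective. Applied to the figure 3 tableau T, it retraces the walkthrough above:

    def simplex(T):
        """Pivot until the top row shows no improving column (optimal)."""
        while True:
            choice = choose_pivot(T)
            if choice is None:
                return T
            T = pivot(T, *choice)

    final = simplex(T)
    print(final[:, -1])   # [-8.  5.  3.  5.]: z = 8 at <x1, x2> = <5, 3>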
We close our discussion of algebraic solutions to the linear programming problem with two fundamental theorems.
Theorem 1. Exactly one of the following situations exists for each linear programming problem.

a) There is no solution,
b) the problem is unbounded, or
c) there is an optimal feasible solution.
Theorem 2. During the simplex algorithm's execution, if the top tableau row indicates that a basic feasible solution cannot be improved by pivoting, then it is optimum.

Now we shall return to geometry in order to provide intuition for linear programming problems and the simplex algorithm. First we move from the plane to n dimensional space. In n dimensions the counterpart of a line is a hyperplane. It is a set of points satisfying an equation such as:
a1x1 + ... + anxn = b
A hyperplane divides the space into two halfspaces according to the inequalities:

a1x1 + ... + anxn ≤ b   and   a1x1 + ... + anxn ≥ b
Since a halfspace is a convex set, so is the intersection of several halfspaces. If the intersection of a finite number of halfspaces is bounded then it is a polytope, in fact, a convex polytope.
Adding a dimension to the polygon featured in figure 3 provides us with a three dimensional polytope. Figure 5 contains it and the halfspaces whose intersection forms it.



Figure 5 - Constraints Forming a Polytope
Some more terminology is in order. The outside of a polytope is composed of faces. Three kinds concern us here. We have vertices, which are faces of dimension zero, edges, which have dimension one, and facets, which are of dimension n-1. A theorem describes this.
Theorem 3. A polytope is the convex hull of its vertices.

This shows us a little of what happens while solving linear programming problems. The basic feasible solutions are just the vertices of the polytope built by intersecting the constraints. And, since the feasible solution region is the convex hull of these vertices and the objective function is linear, an optimum solution always occurs at a vertex.
When we pivot, we are traversing an edge from one vertex (basic feasible solution) to another. When we cannot improve a solution, we have reached an optimum vertex. Since the polytope is convex, this optimum must be a global optimum.
