Real Computer Science begins where we almost stop reading ...: Sorting

Tuesday, 2 July 2013

Sorting

Review of Sorting Algorithms

Selection Sort uses a priority queue P implemented with an unsorted sequence:

Phase 1: the insertion of an item into P takes O(1) time; overall O(n)
Phase 2: removing an item takes time proportional to the number of elements k in P, Θ(k); overall Θ(n²)
Total time complexity: Θ(n²)
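
As a rough illustration of the two phases, here is a minimal sketch that uses a plain java.util.ArrayList as the unsorted sequence P. The class and method names are made up for this example and are not part of the original slides.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: selection sort driven by an unsorted-sequence priority queue.
final class SelectionSortSketch {
    static List<Integer> sort(List<Integer> input) {
        // Phase 1: every item is appended to the end of P without any comparisons.
        List<Integer> p = new ArrayList<>(input);
        List<Integer> sorted = new ArrayList<>(input.size());
        // Phase 2: each removal scans all k remaining elements to find the minimum.
        while (!p.isEmpty()) {
            int minIndex = 0;
            for (int i = 1; i < p.size(); i++) {
                if (p.get(i) < p.get(minIndex)) {
                    minIndex = i;
                }
            }
            sorted.add(p.remove(minIndex)); // remove(int index) returns the removed element
        }
        return sorted;
    }

    public static void main(String[] args) {
        System.out.println(sort(Arrays.asList(5, 2, 9, 1, 5)));
    }
}
```

The scan in phase 2 is what makes the total quadratic: it costs n + (n-1) + ... + 1 steps no matter how the input is ordered.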

Insertion Sort is performed on a priority queue implemented as a sorted sequence:

Phase 1: the first insertItem takes O(1), the second O(2), until the last insertItem takes O(n); overall O(n²).
Phase 2: removing an item takes O(1) time; overall O(n).
Total time complexity: O(n²)
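
The same idea with a sorted sequence can be sketched as follows. This is only an illustration with hypothetical names: P is a java.util.ArrayList kept in nondecreasing order, and each insertion scans backwards from the end to find its position.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: insertion sort driven by a sorted-sequence priority queue.
final class InsertionSortSketch {
    static List<Integer> sort(List<Integer> input) {
        List<Integer> p = new ArrayList<>(input.size());
        // Phase 1: keep P sorted; the i-th insertion may have to pass i-1 elements.
        for (int value : input) {
            int pos = p.size();
            while (pos > 0 && p.get(pos - 1) > value) {
                pos--;
            }
            p.add(pos, value); // equal keys stay in arrival order, so the sort is stable
        }
        // Phase 2: the minimum is always at the front, so reading P from front
        // to back plays the role of the n removals.
        return p;
    }

    public static void main(String[] args) {
        System.out.println(sort(Arrays.asList(5, 2, 9, 1, 5)));
    }
}
```

On input that is already sorted the inner loop never moves, so phase 1 drops to linear time; the O(n²) bound is reached on reverse-sorted input.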

Heap Sort uses a priority queue implemented as a heap.

insertItem and removeMinElement each take O(log k) where k is the number of elements in the heap at a given time.
Phase 1: n elements inserted: O(n log n) time.
Phase 2: n elements removed: O(n log n) time.
Total time complexity: O(n log n)
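
A minimal sketch of the two phases, assuming java.util.PriorityQueue (a binary heap in the standard library) stands in for the heap-based priority queue. The class name is invented for the example.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

// Hypothetical sketch: heap sort expressed through a heap-based priority queue.
final class HeapSortSketch {
    static int[] sort(int[] input) {
        PriorityQueue<Integer> heap = new PriorityQueue<>();
        // Phase 1: n insertions, each O(log k).
        for (int v : input) {
            heap.add(v);
        }
        // Phase 2: n removals of the minimum, each O(log k).
        int[] out = new int[input.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = heap.poll();
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sort(new int[] {5, 2, 9, 1, 5})));
    }
}
```

This version uses O(n) extra space for the heap; the classic in-place heap sort builds the heap inside the input array instead.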

Divide and Conquer

Divide and conquer is more than just a military strategy; it is also a method of algorithm design that has led to efficient algorithms such as merge-sort and quick-sort.

This algorithmic approach has three distinct steps:

Divide: If the input size is too large to deal with in a trivial (or nearly trivial) manner, divide the data into two or more disjoint subsets.

Recurse: Use the same divide-and-conquer algorithm to solve the subproblems associated with the data subsets.

Conquer: Take the solutions to these now-solved subproblems and "merge" them into the solution for the original problem.

Merge-Sort

Algorithm:

Divide: If S has at least two elements (nothing needs to be done if S has zero or one elements), remove all the elements from S and put them into two subsequences, S₁ and S₂, each containing about half of the elements of S.

S₁ contains the first ⌈n/2⌉ elements,
S₂ contains the remaining ⌊n/2⌋ elements.

Recurse: Recursively sort subsequences S₁ and S₂.

Conquer: Put back the elements into S by merging the now-sorted subsequences S₁ and S₂ into a sorted final sequence S.
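
The three steps can be sketched on plain int arrays as below. This is an assumption-laden illustration rather than the sequence-based version used later in these notes: the class name is invented, and the method returns a new sorted array instead of refilling S.

```java
import java.util.Arrays;

// Hypothetical sketch: top-down merge sort on int arrays.
final class MergeSortSketch {
    static int[] mergeSort(int[] s) {
        int n = s.length;
        if (n < 2) {
            return s; // zero or one element: nothing to do (the same array is returned)
        }
        // Divide: S1 gets the first ceil(n/2) elements, S2 the remaining floor(n/2).
        int half = (n + 1) / 2;
        // Recurse on both halves.
        int[] s1 = mergeSort(Arrays.copyOfRange(s, 0, half));
        int[] s2 = mergeSort(Arrays.copyOfRange(s, half, n));
        // Conquer: merge the two sorted halves into one sorted array.
        int[] out = new int[n];
        int i = 0, j = 0, k = 0;
        while (i < s1.length && j < s2.length) {
            out[k++] = (s1[i] <= s2[j]) ? s1[i++] : s2[j++];
        }
        while (i < s1.length) {
            out[k++] = s1[i++];
        }
        while (j < s2.length) {
            out[k++] = s2[j++];
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(mergeSort(new int[] {85, 24, 63, 45, 17, 31, 96, 50})));
    }
}
```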

Merge-Sort Tree

A conceptual tool to visualize the execution of merge sort.
Not a data structure.

Take a binary tree T.

Each node of T represents a recursive call of the merge-sort algorithm.

We associate with each node v of T the set of input passed to the invocation that v represents.

The children of node v are associated with the recursive calls on the subsequences of that node.

The external nodes represent base cases and are associated with individual elements of S, on which no further recursion is required.

Merge-Sort Example


Merging Two Sequences

Pseudo-Code for the "conquer" step

Algorithm merge(S₁, S₂, S)

Input: Sequences S₁ and S₂, on whose elements a total order relation is defined and which are sorted in nondecreasing order, and an empty sequence S.

Output: Sequence S containing the union of the elements from S₁ and S₂ sorted in nondecreasing order; sequences S₁ and S₂ become empty as a result of the execution of this algorithm.

while S₁ is not empty and S₂ is not empty do
    if S₁.first().element() ≤ S₂.first().element() then
        // move the first element of S₁ to the end of S
        S.insertLast(S₁.remove(S₁.first()))
    else
        // move the first element of S₂ to the end of S
        S.insertLast(S₂.remove(S₂.first()))
// move the remaining elements (if any) of S₁ to S
while S₁ is not empty do
    S.insertLast(S₁.remove(S₁.first()))
// move the remaining elements (if any) of S₂ to S
while S₂ is not empty do
    S.insertLast(S₂.remove(S₂.first()))
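
For readers without the textbook Sequence class at hand, the same merge can be sketched with java.util.Deque, where removeFirst and addLast play the roles of remove(first()) and insertLast. The class name is hypothetical, and Integer elements are assumed for simplicity.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Hypothetical sketch: the merge step on standard-library deques.
final class MergeSketch {
    static void merge(Deque<Integer> s1, Deque<Integer> s2, Deque<Integer> s) {
        while (!s1.isEmpty() && !s2.isEmpty()) {
            if (s1.peekFirst() <= s2.peekFirst()) {
                s.addLast(s1.removeFirst()); // move the first element of s1 to the end of s
            } else {
                s.addLast(s2.removeFirst()); // move the first element of s2 to the end of s
            }
        }
        // At most one of the two loops below does any work.
        while (!s1.isEmpty()) {
            s.addLast(s1.removeFirst());
        }
        while (!s2.isEmpty()) {
            s.addLast(s2.removeFirst());
        }
    }

    public static void main(String[] args) {
        Deque<Integer> s1 = new ArrayDeque<>(Arrays.asList(24, 45, 63, 85));
        Deque<Integer> s2 = new ArrayDeque<>(Arrays.asList(17, 31, 50, 96));
        Deque<Integer> s = new ArrayDeque<>();
        merge(s1, s2, s);
        System.out.println(s);
    }
}
```

Every iteration moves exactly one element and does at most one comparison, so merging sequences of total size n takes O(n) time.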

Merge Example


Analysis of Merge-Sort

Assume that n is a power of 2. The same result can be shown for general n.

Proposition 1: The merge-sort tree associated with the execution of merge-sort on a sequence of n elements has a height of ⌈log n⌉ (exactly log n when n is a power of 2).

Informal justification: If n is a power of 2, log n equals the number of times that a sequence of size n can be divided in half.

Analysis of Merge-Sort (cont.)

Proposition 2: A merge-sort algorithm sorts a sequence of size n in O(n log n) time.

Justification:

Assume that access, insertion, and deletion operations on the first elements of S, S1, and S2 are O(1).
When considering the time spent at node v in the recursive tree, only consider the time spent at that node exclusive of the time spent at node v’s children. Then the time spent at node v is proportional to the size of its sequence.
Let i be the depth of node v. The size of node v’s sequence is n/2ⁱ ⇒ the cost of node v is O(n/2ⁱ).
T has exactly 2ⁱ nodes at depth i. The time spent in all nodes at depth i is then O(2ⁱ · n/2ⁱ) = O(n).
Since the tree has O(log n) height, the total cost of merge-sort is O(n log n).

Interface SortObject

public interface SortObject {
// sort sequence S in nondecreasing order using comparator c
public void sort(Sequence S, Comparator c);
}

Class ListMergeSort

public class ListMergeSort implements SortObject {
    public void sort(Sequence S, Comparator c) {
        int n = S.size();
        // a sequence with 0 or 1 element is already sorted
        if (n < 2) return;
        // divide
        Sequence S1 = (Sequence)S.newContainer();
        for (int i=1; i <= (n+1)/2; i++) {
            S1.insertLast(S.remove(S.first()));
        }
        Sequence S2 = (Sequence)S.newContainer();
        for (int i=1; i <= n/2; i++) {
            S2.insertLast(S.remove(S.first()));
        }
        // recur
        sort(S1,c);
        sort(S2,c);
        //conquer
        merge(S1,S2,c,S);
    }

    public void merge(Sequence S1, Sequence S2, Comparator c, Sequence S) {
        while(!S1.isEmpty() && !S2.isEmpty()) {
            if(c.isLessThanOrEqualTo(S1.first().element(),S2.first().element())) {
                S.insertLast(S1.remove(S1.first()));
            }
            else {
                S.insertLast(S2.remove(S2.first()));
            }
        }
        if(S1.isEmpty()) {
            while(!S2.isEmpty()) {
                S.insertLast(S2.remove(S2.first()));
            }
        }
        if(S2.isEmpty()) {
            while(!S1.isEmpty()) {
                S.insertLast(S1.remove(S1.first()));
            }
        }
    }
}

Lafore MergeSort Applet

Quick-Sort

Quick-sort

is another divide-and-conquer sorting algorithm,
can be performed "in-place", meaning that, apart from the recursion stack, only a constant amount of memory is needed in addition to the memory required for the objects being sorted, and
is probably the most commonly used sorting algorithm.

High-level description:

1) Divide: If the sequence S has two or more elements, select an element x from S to be the pivot. Any element, such as the last one, will do. Subdivide S into 3 subsequences:

L, holds the elements of S ≤ x.
E, the pivot element (single-element sequence)
G, holds the elements of S ≥ x.

2) Recurse: Recursively execute the quick-sort algorithm on L and G.

3) Conquer: A simple concatenation of L + E + G.
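
The high-level description translates almost directly into code. The sketch below is an illustration with invented names, and it uses the common three-way variant in which E collects every element equal to the pivot, so L and G hold only strictly smaller and strictly larger elements.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: quick-sort with explicit L, E and G sequences.
final class QuickSortSketch {
    static List<Integer> quickSort(List<Integer> s) {
        if (s.size() < 2) {
            return new ArrayList<>(s); // zero or one element: already sorted
        }
        // Divide: take the last element as the pivot x and split S around it.
        int x = s.get(s.size() - 1);
        List<Integer> l = new ArrayList<>();
        List<Integer> e = new ArrayList<>();
        List<Integer> g = new ArrayList<>();
        for (int v : s) {
            if (v < x) {
                l.add(v);
            } else if (v > x) {
                g.add(v);
            } else {
                e.add(v);
            }
        }
        // Recurse on L and G, then conquer by concatenating L + E + G.
        List<Integer> out = quickSort(l);
        out.addAll(e);
        out.addAll(quickSort(g));
        return out;
    }

    public static void main(String[] args) {
        System.out.println(quickSort(Arrays.asList(85, 24, 63, 45, 17, 31, 96, 50)));
    }
}
```

This version is not in-place: it allocates new lists at every level, which keeps the three steps easy to see.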

Quick-Sort Tree


Quick-Sort Divide Step

In-place quick-sort: The in-place algorithm uses array indices l and r to partition the array-based sequence into subsequences L, E, and G.

l scans the sequence from the left, and r from the right.

A swap is performed when l is at an element larger than the pivot and r is at one smaller than the pivot.

Quick-Sort Partition Step (cont.)

A final swap with the pivot completes the divide (i.e., partition) step.
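
A minimal sketch of this partition scheme on an int array, assuming the last element of each range is the pivot. It is not the ArrayQuickSort.java file referenced in these notes; the class and method names are made up for the example.

```java
import java.util.Arrays;

// Hypothetical sketch: in-place quick-sort with two scanning indices l and r.
final class InPlaceQuickSortSketch {
    static void quickSort(int[] a) {
        quickSort(a, 0, a.length - 1);
    }

    static void quickSort(int[] a, int left, int right) {
        if (left >= right) {
            return; // ranges of size 0 or 1 are already sorted
        }
        int pivot = a[right];
        int l = left;
        int r = right - 1;
        while (l <= r) {
            // l scans from the left until it reaches an element larger than the pivot
            while (l <= r && a[l] <= pivot) {
                l++;
            }
            // r scans from the right until it reaches an element smaller than the pivot
            while (r >= l && a[r] >= pivot) {
                r--;
            }
            if (l < r) {
                swap(a, l, r);
            }
        }
        // Final swap: the pivot lands between the "small" and "large" parts.
        swap(a, l, right);
        quickSort(a, left, l - 1);
        quickSort(a, l + 1, right);
    }

    private static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }

    public static void main(String[] args) {
        int[] data = {85, 24, 63, 45, 17, 31, 96, 50};
        quickSort(data);
        System.out.println(Arrays.toString(data));
    }
}
```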

ArrayQuickSort.java
Lafore Quick-Sort Applet

Analysis of Quick-Sort

Consider a quick-sort tree T:

Let s_i(n) denote the sum of the input sizes of the nodes at depth i in T.

We know that s₀(n) = n since the root of T is associated with the entire input set.

Also, s₁(n) = n-1, since the pivot is not propagated.

Thus, either s₂(n) = n – 3, or, if one of the subsequences at level 1 has zero input size, s₂(n) = n - 2 in the worst case.
In general, s_i(n) = n – i in the worst case.
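The worst-case running time of quick-sort is then:

O(Σ_{i=0}^{n−1} s_i(n)) = O(Σ_{i=0}^{n−1} (n − i)) = O(Σ_{i=1}^{n} i) = O(n²)

That is, the worst-case running time of quick-sort is O(n²). This occurs, for example, when the input sequence is already sorted and the last element is always chosen as the pivot. Nearly worst-case performance occurs on nearly sorted sequences.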
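
The worst case is easy to observe by adding up the input sizes of all nodes of the quick-sort tree. The sketch below is an illustration with invented names: it assumes the last element of each range is the pivot and uses a simple single-scan partition rather than the two-index scheme described earlier.

```java
// Hypothetical sketch: sum of the input sizes over all nodes of the quick-sort tree.
final class QuickSortWorstCaseSketch {
    static long totalInputSize(int[] input) {
        return work(input.clone(), 0, input.length - 1);
    }

    // Sorts a[lo..hi] with the last element as pivot and returns the summed node sizes.
    private static long work(int[] a, int lo, int hi) {
        int size = hi - lo + 1;
        if (size <= 0) {
            return 0; // empty subsequence: no node
        }
        if (size == 1) {
            return 1; // external node holding a single element
        }
        int pivot = a[hi];
        int store = lo;
        for (int i = lo; i < hi; i++) {
            if (a[i] < pivot) {
                swap(a, i, store);
                store++;
            }
        }
        swap(a, store, hi); // the pivot is not passed on to either child
        return size + work(a, lo, store - 1) + work(a, store + 1, hi);
    }

    private static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }

    public static void main(String[] args) {
        int[] sorted = new int[1000];
        for (int i = 0; i < sorted.length; i++) {
            sorted[i] = i;
        }
        System.out.println(totalInputSize(sorted)); // n + (n-1) + ... + 1 for sorted input
    }
}
```

On sorted input with distinct keys every partition leaves G empty, so the sizes are n, n-1, ..., 1 and the total is n(n+1)/2.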

Analysis of Quick-Sort (cont.)

Now to look at the best-case running time of quick-sort:
We can see that quick-sort behaves optimally if, whenever a sequence S is divided into subsequences L and G, the subsequences are of equal size.
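More precisely, every node above depth i removes one pivot, and in the best case there are 2ʲ nodes at depth j, so

s₀(n) = n
s₁(n) = n − 1
s₂(n) = n − (1 + 2) = n − 3
...
s_i(n) = n − (1 + 2 + 2² + ... + 2ⁱ⁻¹) = n − 2ⁱ + 1

The height of the tree is the maximum value of i for which s_i(n) ≥ 1, i.e. for which 2ⁱ ≤ n.

That is to say, the height of the quick-sort tree in the best case is h = log n, and the best-case performance of quick-sort is O(n log n).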

Randomized Quick-Sort

The main drawback to quick-sort is that it achieves its worst-case time complexity on data sets that are common in practice, namely sequences that are already sorted (or mostly sorted).

To avoid this, modifications to quick-sort are available that are more careful about choosing the pivot.

One such modification chooses a random element of the sequence. This modified version is known as randomized quick-sort.

The expected time of a randomized quick-sort on a sequence of size n is O(n log n).

Justification: see the average-case analysis below.
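
A minimal sketch of the random pivot choice, assuming an int array and a caller-supplied java.util.Random. The class name is invented, and the partition is a simple single-scan one chosen for brevity.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical sketch: in-place quick-sort with a uniformly random pivot.
final class RandomizedQuickSortSketch {
    static void sort(int[] a, Random rng) {
        sort(a, 0, a.length - 1, rng);
    }

    private static void sort(int[] a, int lo, int hi, Random rng) {
        if (lo >= hi) {
            return;
        }
        // Pick a random position in [lo, hi] and move that element to the end,
        // so the rest of the code can treat a[hi] as the pivot.
        int r = lo + rng.nextInt(hi - lo + 1);
        swap(a, r, hi);
        int pivot = a[hi];
        int store = lo;
        for (int i = lo; i < hi; i++) {
            if (a[i] < pivot) {
                swap(a, i, store);
                store++;
            }
        }
        swap(a, store, hi);
        sort(a, lo, store - 1, rng);
        sort(a, store + 1, hi, rng);
    }

    private static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }

    public static void main(String[] args) {
        int[] data = {5, 1, 4, 1, 5, 9, 2, 6};
        sort(data, new Random());
        System.out.println(Arrays.toString(data));
    }
}
```

With a random pivot, already-sorted input is no longer a special case: no fixed input forces the quadratic behaviour, although an unlucky run of pivot choices still can.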

Average-Case Analysis of Quick-Sort

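Cost of quick-sort on n items = partition step + cost of 2 recursive calls

T(n) = (an+b) + T_L(n) + T_G(n)

Assume that the size of any subsequence from 0 to n-1 is equally likely, so that L has size i with probability 1/n and G then has size n − 1 − i:

T(n) = (an + b) + (1/n) Σ_{i=0}^{n−1} [T(i) + T(n − 1 − i)] = (an + b) + (2/n) Σ_{i=0}^{n−1} T(i)

which implies that

n T(n) = n(an + b) + 2 Σ_{i=0}^{n−1} T(i)
(n − 1) T(n − 1) = (n − 1)(a(n − 1) + b) + 2 Σ_{i=0}^{n−2} T(i)

Subtract the above two equations

n T(n) − (n − 1) T(n − 1) = 2 T(n − 1) + 2an + (b − a)

Rearrange and drop insignificant constant terms

n T(n) = (n + 1) T(n − 1) + 2an

Average-Case Analysis of Quick-Sort (cont.)

Divide by n(n+1)

T(n)/(n + 1) = T(n − 1)/n + 2a/(n + 1)

Solve the recursive relationship

T(n)/(n + 1) = T(0) + 2a Σ_{k=2}^{n+1} 1/k = T(0) + 2a (H_{n+1} − 1)

where H_n = Σ_{k=1}^{n} 1/k is the n^th Harmonic number, which can be bounded above and below by

ln(n + 1) ≤ H_n ≤ 1 + ln n

Thus

T(n) = O(n log n)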
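
The O(n log n) bound can also be checked numerically. The sketch below assumes the averaged recurrence T(n) = (an + b) + (2/n)(T(0) + ... + T(n−1)) with T(0) = 0, which is what the equal-likelihood assumption gives; the class name and the choice a = 1, b = 0 are made up for the illustration.

```java
// Hypothetical sketch: tabulating the average-case recurrence for quick-sort.
final class AverageCaseRecurrenceSketch {
    // Returns t[0..maxN] with t[0] = 0 and t[n] = (a*n + b) + (2/n) * (t[0] + ... + t[n-1]).
    static double[] solve(int maxN, double a, double b) {
        double[] t = new double[maxN + 1];
        double prefix = 0.0; // running sum t[0] + ... + t[n-1]
        for (int n = 1; n <= maxN; n++) {
            prefix += t[n - 1];
            t[n] = (a * n + b) + 2.0 * prefix / n;
        }
        return t;
    }

    public static void main(String[] args) {
        double[] t = solve(1_000_000, 1.0, 0.0);
        for (int n = 1_000; n <= 1_000_000; n *= 10) {
            // The ratio to n ln n stays bounded, as the O(n log n) result predicts.
            System.out.printf("n=%d  T(n)/(n ln n)=%.3f%n", n, t[n] / (n * Math.log(n)));
        }
    }
}
```

With a = 1 and b = 0 the ratio climbs slowly towards 2 and never exceeds it, matching the factor 2a in front of the harmonic sum.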

Selection and Order Statistics

Order Statistic - Identify a single element relative to its rank in the collection.

Example: the median, the element that would be at the middle rank (about n/2) in a sorted collection.

Selection problem – select the kth smallest element from a collection of n similar elements.

Approaches to the selection problem

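Sort the collection:

O(n log n) time using a comparison-based sorting algorithm, after which the kth smallest element can be read directly from its position in the sorted order.

Randomized quick-select algorithm:

O(n²) worst-case performance, but

O(n) expected performance

Similar to randomized quick-sort. Chooses a random element from the collection as the pivot and recursively calls itself on only one of the resulting subsequences.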

Randomized Quick-Select

Algorithm quickSelect(S,k):

Input: Sequence S of n comparable elements, and an integer k in [1,n]
Output: The kth smallest element of S

if n = 1 then
    return the (first) element of S
Pick a random integer r in the range [0,n-1]
Let x be the element of S at rank r
Remove all the elements from S and put them into three sequences:
    L, storing the elements in S less than x
    E, storing the elements in S equal to x
    G, storing the elements in S greater than x
if k ≤ L.size() then
    return quickSelect(L,k)
else if k ≤ L.size() + E.size() then
    // each element in E is equal to x
    return x
else
    // note the new selection parameter
    return quickSelect(G,k-L.size()-E.size())
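
The pseudo-code maps onto Java as sketched below. The class name is hypothetical, Integer elements are assumed, k is 1-based as in the pseudo-code, and unlike the pseudo-code the input list is left untouched rather than emptied.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical sketch: randomized quick-select on a list of integers.
final class QuickSelectSketch {
    // Returns the kth smallest element of s, assuming s is non-empty and 1 <= k <= s.size().
    static int quickSelect(List<Integer> s, int k, Random rng) {
        if (s.size() == 1) {
            return s.get(0);
        }
        int x = s.get(rng.nextInt(s.size())); // element at a random rank r in [0, n-1]
        List<Integer> l = new ArrayList<>();
        List<Integer> e = new ArrayList<>();
        List<Integer> g = new ArrayList<>();
        for (int v : s) {
            if (v < x) {
                l.add(v);
            } else if (v > x) {
                g.add(v);
            } else {
                e.add(v);
            }
        }
        if (k <= l.size()) {
            return quickSelect(l, k, rng);
        } else if (k <= l.size() + e.size()) {
            return x; // each element in E is equal to x
        } else {
            // note the new selection parameter
            return quickSelect(g, k - l.size() - e.size(), rng);
        }
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(7, 2, 9, 4, 4, 1);
        System.out.println(quickSelect(data, 3, new Random()));
    }
}
```

Because E always contains at least the pivot, both L and G are strictly smaller than S, so the recursion terminates even when S contains many duplicates.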
