Tuesday, 2 July 2013

Linked Lists, Recursion


Pointers

The concept of pointers is a central theme in data structures. Java implements pointers implicitly through reference-type variables. Recall that all non-primitive variables in Java are references to objects. For example, the statement

Point coord1 = new Point(25, 50);

declares and instantiates a new Point object. The variable coord1 is a reference (i.e., implicit pointer) to the object.
The above illustration is an example of pointer diagramming notation, which we shall use extensively throughout the course. This notation only shows the instance variables in an object and not the methods.

If we now declare and instantiate another Point object, then we will have two reference variables and two objects, as shown below.

Point coord2 = new Point();
coord2.x = coord1.y;
coord2.y = coord1.x;

Assigning a reference variable to refer to another merely causes the reference value to point to the target object.
coord1 = coord2;
Note that the above operation does not create a new Point object. The original object (25,50) is no longer referenced by any variable and will eventually be reclaimed by the garbage collector.


Linked Lists

  • Store a collection of elements non-contiguously.
  • Allow insertion or deletion of items anywhere in the collection with only a constant (i.e., Q (1)) amount of data movement.
  • Are dynamic data structures whose size can be arbitrarily modified. In contrast, arrays are static data structures whose size cannot be changed.
  • Can be either ordered (i.e., positioned according to ascending or descending order) or unordered.
  • Can be used to implement many abstract data types:
  • stacks
  • queues
  • deques
  • sequences
  • priority queues
  • others
Linked-List Node
A linked list is constructed from a collection of self-referential nodes which may be defined as follows:
class SLNode {
private Object element; // element stored in this node
private SLNode next; // reference to the next node in the list
:
}

SLNode constructors and methods

A SLNode object can be created to point to the next object in the list.
public SLNode(Object e, SLNode n) { //# create a node given element and next element = e;
next = n;
}

An alternative constructor has only one argument for the item; the next field is set to null.
public SLNode(Object element)
{
this.element = element;
this.next = null; //optional
}

An update method allows modification of the next field.
public void setNext(SLNode newNext)
{
next = newNext;
}
Another update method allows modification of the element field.
public void setElement(Object newElem) { element = newElem; }
An access method returns the value of the next reference.
public SLNode getNext() { return next; }
Another access method returns the object stored in the element field.


public Object getElement() { return element; } Building a linked list from SLNode objects
SLNode myNode = new SLNode("MD");

myNode.setNext(new ListNode("VA"));

myNode.getNext().setNext(new ListNode("DC"));
We could have built the same list with a single command.
ListNode myNode = new ListNode("MD", new ListNode("VA", new ListNode("DC")));


Linked-List Animation


Abstract Data Types (ADTs)
  • An abstract data type (ADT) is a collection of data and the particular operations allowed on them.
  • An ADT is considered abstract because the operations performed on it are separated from the underlying implementation.
  • Most data-structure ADTs can be implemented using either a contiguous (array-based) or linked-data representation.
  • An ADT hides the implementation of a data structure behind a well defined interface.



Scope
Today we concentrate on
  • Understanding what a linked list is
  • Fundamental operations on a linked list
  • Defining the SimpleList ADT using a Java interface
  • Implementing the SimpleList ADT using a linked list
Topics to be covered later in the course
  • A more comprehensive "Sequence" ADT.
  • Detailed performance comparison of linked-data versus contiguous-data representations of the Sequence and other ADTs.
  • Functional operations: what is a list used for.
  • Ordering of the list items (sorted? unsorted?)
  • How to efficiently search a list.
  • How to efficiently sort a list.
Consider Consequence of "Obvious" Implementation

Removal from head
tmp <--- head
head <--- head.getNext()
elem <--- tmp.getElement()
tmp.setNext(null)
return elem

Insertion at head
newNode.setNext(head)
head <--- newNode


Insertion at tail

tail.setNext(newNode)
tail <--- tail.getNext()
tail.setNext(null)

Deletion from tail
Can’t be done without incurring O(n) operations.
The Need for a Sentinel Node
Consider inserting at the position prior to the current node.

We would either have to traverse the list up to current or create a new temporary node in which the contents of current are copied. The latter solution is O(1) so we can opt for it.

tmp.setElement(current.getElement());
tmp.setNext(current.getNext());
current.setElement(newNode.getElement());
current.setNext(tmp);
But what if we wanted to delete the last node?

In this case, there is no way to avoid traversing the entire list, since we need to make a change to the node storing "D".
So we adopt the convention that "precursor" points to the node prior to current. The insertion point is in front of current and beyond precursor.
But this new convention leads to problems when removing the first element in the list, inserting in front of the first element in the list, and lists of one or fewer elements.

The solution is to use a sentinel, or header node.

Linked-list implementation of SimpleList ADT
Store header, tail, and current references to point to front of list, end of list, and current position, respectively.

public class LinkedSimpleList implements SimpleList
{
// Instance variables:
SLNode header, precursor, tail;
:::
Two features of the implementation are subtle and require elaboration. The precursor reference actually points to the node prior to the target node, rather than the target node itself. This use of precursor avoids expensive deletions at the end of the list (O(n) if current points to target node). All data movements are Q (1). The other feature is the use of an empty header or sentinel node. When precursor points to the previous node, insertions and deletions at the front of the list become an annoying special case. The sentinel node avoids this problem by providing a previous node for all cases. We shall see other examples of sentinel nodes later. The initial state of the linked list is empty.

public LinkedSimpleList()
{
header = new SLNode();
precursor = tail = header;
}
Inserting and appending nodes to a linked list
We insert new nodes at the position between precursor and precursor.next. Insertion is thus a simple splice and can be applied to any position.



public void insert(Object newElem) throws BoundaryViolationException {
if (precursor == null )
throw new BoundaryViolationException( "Linked list insertion error" );
else
{
SLNode newNode = new SLNode(newElem, precursor.getNext());
precursor.setNext(newNode);
if (precursor == tail)
tail = precursor.getNext();
}
}
The append() method appends a new SLNode to the end of the linked list, i.e., at tail.next.

public void append(Object newElem)
{
SLNode newNode = new SLNode(newElem);
tail.setNext(newNode);
tail = newNode;
}
Removing and retrieving items
Basic deletion is a bypass in the linked list. We assume that precursor.next points to the node to be removed.



//------------------------------------------------------------
// Removes the SLNode at the current position, i.e., at
// precursor.next. Returns the object that was stored in the node
// that was removed.
public Object remove() throws BoundaryViolationException
{
if(!isInList())
{
throw new BoundaryViolationException(
"Linked list error removing current node");
}
else
{
Object retElem = precursor.getNext().getElement();
if(tail == precursor.getNext()) tail = precursor;
precursor.setNext(precursor.getNext().getNext());
return retElem;
}
}



Methods for class LinkedSimpleList

//------------------------------------------------------------
// Returns the object stored at the current position, precursor.next.
public Object getCur() throws BoundaryViolationException
{
if (!isInList())
{
throw new BoundaryViolationException(
"Error getting current elem");
}
else
{
Object retElem = precursor.getNext().getElement();
return retElem;
}
}
//------------------------------------------------------------
// Copies the formal parameter elem to the SLNode at the current
// position, precursor.next. Throws a BoundaryViolationException if
// the current position does not refer to a non-null node.
public void setCur(Object elem) throws BoundaryViolationException
{
if (!isInList())
{
throw new BoundaryViolationException(
"Error getting current elem");
}
else
{
precursor.getNext().setElement(elem);
}
}
//------------------------------------------------------------
// Resets the linked list to the initial state.
public void clear()
{
header.setNext(null);
precursor = tail = header;
}


//------------------------------------------------------------
// Sets the precursor position to point to the first node in the list.
// Since precursor points to the node prior to the current node,
// it is set to point to the heaser node.
public void setFirst()
{
precursor = header;
}


//------------------------------------------------------------
// Assigns the current position to the last node in the list. As such,
// precursor is set to point to the node prior to tail.
public void setLast()
{
if(!isEmpty())
{
if(!isInList())
precursor = header;
while(precursor.getNext() != tail) advance();
}
}

//------------------------------------------------------------
// Sets precursor to point to tail. Afterwards, isInList will return
// false because the current position, precursor.next, will be null.
// This method may be useful for the purpose of appending new nodes
// to the list in conjunction with the insert method. However, this
// method should probably be avoided, since a separate append method
// has been provided.
public void setTail()
{
precursor = tail;
}
//------------------------------------------------------------
// Advances the precursor position by one elem. Throws a
// BoundaryViolationException if the current position is outside
// the boundaries of the list.
public void advance() throws BoundaryViolationException
{
if(!isInList())
{
throw new BoundaryViolationException("Error advancing linked list");
}
else
precursor = precursor.getNext();
}
//------------------------------------------------------------
// Returns the number of elems in the list.
// Is this method efficient?
public int length()
{
SLNode tmp;
int count = 0;
tmp = header.getNext();
while(tmp != null)
{
count++;
tmp = tmp.getNext();
}
return count;
}
//------------------------------------------------------------
// Returns true is the list has no elements, true if it has at least
// one element.
public boolean isEmpty()
{
return (header.getNext() == null);
}


//------------------------------------------------------------
// Returns true if the current position is a valid position in the
// list.
public boolean isInList()
{
return ((precursor != null) && (precursor.getNext() != null));
}

// Prints all objects in the list to standard output.
public void print()
{
for(setFirst();isInList();advance())
System.out.println(getCur());
}

Linked lists – discussion
  • We could have chosen many other implementations of a linked list.

  • The header reference could have pointed to the actual first node on the list.

  • A current node could have pointed to the actual target node (as opposed to using a precursor node).

  • We can also implement the SimpleList ADT using an array. One disadvantage of an array is that arbitrary insertions and deletions may require O(n) data movement operations, which are expensive in practice.
  • We can also implement the SimpleList ADT using a doubly linked list.
  • The important thing to remember is that the user of the software should not have to be concerned with the implementation you choose.

Recursion


  • Recursive methods in Java
  • Recursion and mathematical induction
  • Dangers of recursion
    • infinite regress
    • redundant calculations
  • Mutual recursion
  • Divide-and-conquer recursion
  • The "Towers of Hanoi" puzzle

What is recursion?





A recursive definition is one that is defined in terms of itself. It must include a base case so that the recursion will eventually end.
Recursion in mathematics
Many mathematical functions have recursive definitions. Consider the factorial (!) operation on non-negative integers.
To illustrate,


Recursive methods in Java
Java methods may be recursively defined; that is, a Java method may contain one or more calls to itself. The programmer should be aware that every method call, recursive or otherwise, has certain overhead operations associated with it. When a method is invoked, information about that method (such as its actual parameters, local variables, and return value) are placed on the Java Virtual Machine method stack, also known as the Java stack, virtual stack or activation record. As such, recursive methods are often more expensive than their iterative equivalents. Recursion is nonetheless a useful programming tool and is encountered frequently in data structures and algorithms. Later in the course, we shall encounter highly effective algorithms for searching and sorting that make use of recursion.
Recursive factorial method

int factorial(int n)
{
  if(n < 1) // base case
    return 1;
  else
    return n * factorial(n-1); // recursive call
}
Iterative factorial method int factorial(int n)
{
  int result = 1;
  for(int m = n; m > 0; m--)
    result *= m;
  return result;
}


Recursive Call Trace, Factorial Method
Call traces help us understand the recursive process and analyze the complexity of recursive algorithms. Since each node of the factorial method’s call trace is O(1), the recursive invocation of factorial(n) is O(n).

 
 
Recursion and mathematical induction
Recall proofs by induction
    1. Show that the theorem is true for the smallest case. This can usually be done by inspection.

    2.  
          ===> basis
    3. Show that if the theorem is true for the basis cases, it can be extended to include the next case. If the theorem is true for the case k=n-1, show that it is true for the case k=n.

===> Inductive hypothesis assumes that the theorem is true for the case k=n-1. Similarity with recursive programming
    1. The base case of a recursive method is the case that can be solved without a recursive call.
    2. For the general (non-base) case, k=n, we assume that the recursive method works for k=n-1.
Recursion is the programming equivalent of induction.

Dangers of recursion: infinite regress
Consider the following poorly designed implementation of the factorial method:

int factorial(int n)
{
    System.out.println("factorial("+n+") called");
    if(n == 1)
        return 1;
    else
        return n * factorial(n-1);
}
What if we called this method with an argument of 0? We would see the following console output. factorial(0) called
factorial(-1) called
factorial(-2) called
:
factorial(-9999) called
… and so on, until a stack overflow occurs. This phenomenon is known as infinite regress and is a real danger of recursive programming. To avoid infinite regress, follow these two rules.
  1. Take special care to ensure that the base case is properly implemented.
  2. Make sure that each subsequent recursive call represents progress toward the base case.

Dangers of recursion: redundant calculations
Consider the Fibonacci number sequence, defined as
                The sequence: {0,1,1,2,3,5,8,13,...}
where n is a non-negative integer. A method to generate Fibonacci numbers can be implemented recursively.

public static int fib(int n) {
if(n <= 1) return n; //base case
else return fib(n-1) + fib(n-2);
}
Consider now the call trace for n=5.
Note the redundant calculations: 3 calls to fib(0), 5 calls to fib(1), and so on. Each recursive call does more and more redundant work, resulting in exponential growth.

Mutual recursion
So far we have only considered recursive methods that call themselves. Another type of recursion involves methods that cyclically call each other. This is known as cyclical or mutual recursion. In the following example, methods A and B are mutually recursive.
 
 
void A(int n)

  if (n <= 0) return;
  n--;
  B(n);
}
 
  void B(int n)
{
  if (n <= 0) returnl
  n--;
  A(n);
}



 
 

Divide-and-conquer recursion




One of the more effective uses of recursion is through a class of algorithms known as divide and conquer.

Divide – Smaller problems (subproblems) are solved recursively (except, of course, for the base case). Conquer – The solution to the original problem is then formed from solutions to the subproblems.
  • Divide-and-conquer algorithms generally contain at least two recursive calls in the method.
  • The smaller recursive subproblems must be disjoint (i.e., non-overlapping) in order for the approach to be considered divide-and-conquer.
The following is a divide-and-conquer method for computing the sum of squares of all integers between m and n, i.e.,

int ssq(int m, int n)
{
  if(m > n) return –1; // error condition
  else if (m == n) return m*m;
  else
  {
    int middle = (m+n)/2;
    return ssq(m,middle) + ssq(middle+1,n);
  }
}


Recursion Call Trace for ssq Method


  • The two numbers inside the nodes are m and n.
  • The number above a node is its return value.

Towers of Hanoi
At a remote temple somewhere in Asia, a group of monks is working to move 64 disks from the leftmost peg (peg A) to the rightmost peg (peg C). After all 64 disks have been moved, the universe will dissolve! When will this happen if each disk takes 1 second to move?
Rules of the puzzle:
    • Larger disks may not be placed on top of smaller disks.
    • Only one disk may be moved at a time.
10-disk illustration


Towers of Hanoi (2)
Consider the n=4 subproblem.
Step 1: move 3 disks from peg A to peg B


Towers of Hanoi (3)
Step 2: move 1 disk from peg A to peg C
Step 3: move 3 disks from peg B to peg C


Towers of Hanoi (4)
 public static void tower(int n, char start, char finish, char spare)
{
  if (n<=1)
    System.out.println("Moving a disk from peg " + start +" to peg " +
                      finish);
  else
  {
    tower(n-1,start,spare,finish);
    System.out.println("Moving a disk from peg " + start +" to peg " +
                        finish);
   tower(n-1,spare,finish,start);
  }
}
 
Note that the solution to this problem is a divide-and-conquer algorithm. Calling the above method with n=4, i.e., tower(4,'A','C','B') yields the following result.

Moving a disk from peg A to peg B
Moving a disk from peg A to peg C
Moving a disk from peg B to peg C
Moving a disk from peg A to peg B
Moving a disk from peg C to peg A
Moving a disk from peg C to peg B
Moving a disk from peg A to peg B
Moving a disk from peg A to peg C
Moving a disk from peg B to peg C
Moving a disk from peg B to peg A
Moving a disk from peg C to peg A
Moving a disk from peg B to peg C
Moving a disk from peg A to peg B
Moving a disk from peg A to peg C
Moving a disk from peg B to peg C


Towers of Hanoi (5)
Call trace for n=4:
numbers inside nodes are n, start, finish

 
Number of moves = 1 + 2 + 4 + 8.
For general n, number of moves = .
===> O(2n) complexity
For n=64, number of moves = 264-1 or 1.844 x 1019.
Recall that one move takes one second.
===> Time left in universe = 1.844 x 1019 seconds = 5.85 x 1011 years.
Astronomers estimate the present age of the universe to be 20 billion (2 x 1010) years.