Real Computer Science begins where we almost stop reading ...: The Halting Problem

Monday, 24 June 2013

The Halting Problem

The "Halting Problem" is a very strong, provably correct, statement
  that no one will ever be able to write a computer program or design
  a Turing machine that can determine if an arbitrary program will
  halt (stop, exit) for every input.

  This is NOT saying that some programs or some Turing machines can not
  be analyzed to determine that they, for example, always halt.

  The Halting Problem says that no computer program or Turing machine
  can determine if ALL computer programs or Turing machines will halt
  or not halt on ALL inputs. To prove the Halting Problem is
  unsolvable we will construct one program and one input for which
  there is no computer program or Turing machine that can correctly
  determine if it halts or does not halt.

  We will use very powerful mathematical concepts and do the proofs
  for both a computer program and a Turing machine. The mathematical
  concepts we need are:

  Proof by contradiction. Assume a statement is true, show that the
  assumption leads to a contradiction. Thus the statement is proved false.

  Self referral. Have a computer program or a Turing machine operate
  on itself, well, a copy of itself, as input data. Specifically we
  will use diagonalization, taking the enumeration of Turing machines
  and using TMi as input to TMi.

  Logical negation. Take a black box that accepts input and outputs
  true or false, put that black box in a bigger black box that 
  switches the output so it is false or true respectively.

  The simplest demonstration of how to use these mathematical concepts
  to get an unsolvable problem is to write on the front and back of
  a piece of paper "The statement on the back of this paper is false."
  Starting on side 1, you could choose "True" and thus deduce side 2
  is "False". But staring at side 2, which is exactly the same as
  side 1, you get that side 2 is "True" and thus side 1 is "False."  
  Since side 1, and side 2, can be both "True" and "False" there
  is a contradiction. The problem of determining if sides 1 and 2 are
  "True" of "False" is unsolvable. Sometimes called a paradox.


  The Halting Problem for a programming language. We will use the "C"
  programming language, yet any language will work.

  Assumption: There exists a way to write a function named Halts such that:

    int Halts(char * P, char * I)
    {
      /* code that reads the source code for a "C" program, P,
         determines that P is a legal program, then determines if P
         eventually halts (or exits) when P reads the input string I,
         and finally sets a variable "halt" to 1 if P halts on input I,
         else sets "halt" to 0 */
      return halt;
    }

  Construct a program called Diagonal.c as follows:

  int main()
  {
    char I[100000000];          /* make as big as you want or use malloc */
    read_a_C_program_into( I ); 
    if ( Halts(I,I) )  { while(1){} }   /* loop forever, means does not halt */
    else return 1;
  }

  Compile and link Diagonal.c into the executable program Diagonal.
  Now execute
      Diagonal < Diagonal.c

  Consider two mutually exclusive cases:
  Case 1: Halts(I,I) returns a value 1.
          This means, by the definition of Halts, that Diagonal.c halts
          when given the input Diagonal.c.
          BUT! we are running Diagonal.c (having been compiled and linked)
          and so we see that Halts(I,I) returns a value 1 causes the "if"
          statement to be true and the  "while(1){}" statement to be executed,
          which never halts, thus our executing Diagonal.c does NOT halt.
          This is a contradiction because this case says that Diagonal.c
          does halt when given input Diagonal.c.
          Well, try the other case.

  Case 2: Halts(I,I) returns a value 0.
          This means, by the definition of halts, that Diagonal.c does NOT
          halt when given the input Diagonal.c.
          BUT! we are running Diagonal.c (having been compiled and linked)
          and so we see that Halts(I,I) returns a value 0 causes the "else"
          to be executed and the main function halts (stops, exits).
          This is a contradiction because this case says that Diagonal.c
          does NOT halt when given input Diagonal.c.
          There are no other cases, Halts can only return 1 or 0.
          Thus what must be wrong is our assumption "there exists a way
          to write a function named Halts..."

Every Turing machine can be represented as a unique binary number.
Any method of encoding could be used. We can assume ASCII encoding of sample
      s        0      space   ...    R
as 01110011 00110000 00100000 ... 01010010 just a big binary integer.


  The Halting Problem for Turing machines.

  Assumption: There exists a Turing machine, TMh, such that:
  When the input tape contains the encoding of a Turing machine, TMj followed
  by input data k, TMh accepts if TMj halts with input k and TMh rejects
  if TMj is not a Turing machine or TMj does not halt with input k.

  Note that TMh always halts and either accepts or rejects.
  Pictorially TMh is:


 +---------------------------- 
 | encoded TMj B k BBBBB ... 
 +---------------------------- 
    ^ read and write, move left and right
    |
    |  +-----+
    |  |     |--> accept
    +--+ FSM |                     always halts
       |     |--> reject
       +-----+

  We now use the machine TMh to construct another Turing machine TMi.
  We take the Finite State Machine, FSM, from TMh and
    1) make none of its states be final states
    2) add a non final state ql that on all inputs goes to ql
    3) add a final state qf that is the accepting state

  Pictorially  TMi is:


 +------------------------------------------- 
 | encoded TMj B k BBBBB ... 
 +------------------------------------------- 
    ^ read and write, move left and right
    |
    |  +----------------------------------+
    |  |                      __          |
    |  |                     /   \ 0,1    |
    |  |                   +-| ql |--+    |
    |  | +-----+           | \___/   |    |
    |  | |     |--> accept-+   ^     |    |
    +--+-+ FSM |               |_____|    |            may not halt
       | | TMh |--> reject-+    _         |
       | +-----+           |  // \\       |
       |                   +-||qf ||------|--> accept
       |                      \\_//       |
       +----------------------------------+

  We now have Turing machine TMi operate on a tape that has TMi as
  the input machine and TMi as the input data.



 +------------------------------------------- 
 | encoded TMi B encoded TMi BBBBB ... 
 +------------------------------------------- 
    ^ read and write, move left and right
    |
    |  +----------------------------------+
    |  |                      __          |
    |  |                     /   \ 0,1    |
    |  |                   +-| ql |--+    |
    |  | +-----+           | \___/   |    |
    |  | |     |--> accept-+   ^     |    |
    +--+-+ FSM |               |_____|    |            may not halt
       | |     |--> reject-+    _         |
       | +-----+           |  // \\       |
       |                   +-||qf ||------|--> accept
       |                      \\_//       |
       +----------------------------------+

  Consider two mutually exclusive cases:
  Case 1: The FSM accepts thus TMi enters the state ql.
          This means, by the definition of TMh that TMi halts with input TMi.
          BUT! we are running TMi on input TMi with input TMi
          and so we see that the FSM accepting causes TMi to loop forever
          thus NOT halting.
          This is a contradiction because this case says that TMi
          does halt when given input TMi with input TMi.
          Well, try the other case.

  Case 2: The FSM rejects thus TMi enters the state qf.
          This means, by the definition of TMh that TMi does NOT halt with
          input TMi.
          BUT! we are running TMi on input TMi with input TMi
          and so we see that the FSM rejecting cause TMi to accept and halt.
          This is a contradiction because this case says that TMi
          does NOT halt when given input TMi with input TMi.
          There are no other cases, FSM either accepts or rejects.
          Thus what must be wrong is our assumption "there exists a
          Turing machine, TMh, such that..."
          QED.

  Thus we have proved that no Turing machine TMh can ever be created
  that can be given the encoding of any Turing machine, TMj, and any input, k,
  and always determine if TMj halts on input k.
Real Computer Science begins where we almost stop reading ...

Monday, 24 June 2013

The Halting Problem

No comments:

Post a Comment