Monday, 24 June 2013

NPDA to CFG/CFL

 Given a NPDA  M = ( Q, Sigma, Gamma, delta, q0, Z0, Phi) and
  Sigma intersection Gamma = Phi,
  Construct a CFG  G = ( V, T, P, S )

  Set T = Sigma
      S = S
      V = { S } union { [q,A,p] | q and p in Q and A in Gamma }
          This can be a big set! q is every state with A every Gamma with
          p every state. The cardinality of V, |V|=|Q|x|Gamma|x|Q|
          Note that the symbology [q,A,p] is just a variable name.
          (the states in the NPDA are renamed q0, q1, ... if necessary)

  Construct the productions in two stages, the S -> , then the [q,A,p] ->

  S -> [q0,Z0,qi]  for every qi in Q (including q0)  |Q| of these productions

  [qi,A,qm+1] -> a[qj,B1,q2][q2,B2,q3][q3,B3,q4]...[qm,Bm,qm+1] is created
  for each q2, q3, q4, ..., qm+1 in Q
  for each a in Sigma union {epsilon}
  for each A,B1,B2,B3,...,Bm in Gamma
  such that there is a delta of the form

    delta(qi,a,A) = { ...,(qj,B1B2B3...Bm), ...}

  Note three degenerate cases:

           delta(qi,a,A)=phi  makes no productions

           delta(qi,a,A)={(qj,epsilon)} makes [qi,A,qj] -> a

           delta(qi,epsilon,A)={(qj,epsilon)} makes [qi,A,qj] -> epsilon


  The general case:

  Pictorially, given  delta(qi,a,A)= (qj,B1B2)   generate the set
                            |  | |    |  | |     for qk being every state,
              +-------------+  | |    |  | |     while qm+1 is every state
              |  +---------------+    |  | |
              |  |             |      |  | |
              |  |          +--+      |  | |
              |  |          | +-------+  | |
              |  |          | |  +-------+ |
              |  |          | |  |         |
              V  V          V V  V         V
             [qi,A,qm+1] -> a[qj,B1,qk][qk,B2,qm+1]
                   |                |   ^     ^
                   |                |   |     |
                   |                +---+     |
                   +--------------------------+

  The book suggests to follow the chain of states starting with the right
  sides of the S -> productions, then the new right sides of the [q,a,p] ->
  productions. The correct grammar is built generating all productions.
  Then the "simplification" can be applied to eliminate useless variables,
  eliminate nullable variables, eliminate unit productions, convert to
  Chomsky Normal Form, convert to Greibach Normal Form.

  WOW! Now we have the Greibach Normal Form of a NPDA with any number of
  states and we can convert this to a NPDA with just one state by the
  construction in the previous lecture.

  The important concept is that the constructions CFG to NPDA and NPDA to CFG
  provably keep the same language being accepted. Well, to be technical,
  The language generated by the CFG is exactly the language accepted by
  the NPDA. Fixing up any technical details like renaming Gamma symbols
  if Gamma intersection Sigma not empty and accepting or rejecting
  the null string appropriately.

  The reverse of the example in the previous lecture. |Q|=1 makes it easy.
  Given: NPDA = (Q, Sigma, Gamma, delta, q0, Z0, F)
         Q={q} Sigma={0,1} Gamma={C,S,T} q0=q  Z0=S  F=Phi
         delta(q, 0, C) = (q, CT)
         delta(q, 0, C) = (q, TT)
         delta(q, 0, S) = (q, C)
         delta(q, 0, S) = (q, T)
         delta(q, 1, T) = (q, epsilon)

  Build: G = (V, T, P , S)
         V = { S, qCq, qSq, qTq }  four variable names
         T = Sigma = {0, 1}        (dropping the previous punctuation [,,])
         S = Z0 = S
  S-> productions
         S -> qSq       Note!  qSq is a single variable 
                               just dropped the [,, ] symbols

  other productions:
       delta(q, 0, C)  =  (q, CT)
             |  |  |       |  ||
       +-----+  |  |  +----+  ||
       |+----------+  |+------+|
       ||       |     ||     +-+
       ||       |     ||     |
       qCq  ->  0     qCq   qTq       was  C -> 0 C T
         |              |   | |
         |              +---+ |    
         +--------------------+

  continue using same method on each delta:
      qCq -> 0  qTq  qTq
      qSq -> 0  qCq
      qSq -> 0  qTq
      qTq -> 1                 (epsilon becomes nothing)

  Now, if you prefer, rename the variables to single letters
  (assuming you have a big enough alphabet)
  For this simple example  qCq becomes just C, qSq becomes just S
  and qTq becomes just T, thus the productions become:
      C -> 0  C  T
      C -> 0  T  T
      S -> 0  C
      S -> 0  T
      T -> 1

  This grammar is Greibach normal form for L(G)={0^n 1^n | n<0}

Now, working an example for another NPDA for this same language:

  NPDA  M = ( Q, Sigma, Gamma, delta, q0, Z0, Phi)
  Q = { q0, q1 }
  Sigma = { 0, 1 }
  Gamma = { Z0, V0, V1 }
  delta =
          (q0,0,Z0) = (q0,V0)      pop Z0 write V0 (for zero)
          (q0,0,V0) = (q0,V0V0)    add another V0 to stack
          (q0,1,V0) = (q1,epsilon) pop a V0 for a one
          (q1,1,V0) = (q1,epsilon) pop a V0 for each 1
  accept on empty stack

  Build: G = (V, T, P , S)
         V = { S, see below in productions }  
         T = Sigma = {0, 1} 
         S = Z0 = S

  S-> productions
       S -> [q0,Z0,q0]  (one for each state)               P1
       S -> [q0,Z0,q1]                                     P2

  delta productions
          (q0,0,Z0) = (q0,V0)  (one for each state)

          [q0,Z0,q0] -> 0 [q0,V0,q0]                       P3
                  |               |
                  +---------------+

          [q0,Z0,q1] -> 0 [q0,V0,q1]                       P4
                  |               |
                  +---------------+

          (q0,0,V0) = (q0,V0V0)  (two combinations of two states)


          [q0,V0,q0] -> 0 [q0,V0,q0] [q0,V0,q0]            P5
                  |               |    |     |
                  |               +----+     |
                  +--------------------------+

          [q0,V0,q0] -> 0 [q0,V0,q1] [q1,V0,q0]            P6
                  |               |    |     |
                  |               +----+     |
                  +--------------------------+

          [q0,V0,q1] -> 0 [q0,V0,q0] [q0,V0,q1]            P7
                  |               |    |     |
                  |               +----+     |
                  +--------------------------+

          [q0,V0,q1] -> 0 [q0,V0,q1] [q1,V0,q1]            P8
                  |               |    |     |
                  |               +----+     |
                  +--------------------------+

          (q0,1,V0) = (q1,epsilon)

          [q0,V0,q1] -> 1                                  P9

          (q1,1,V0) = (q1,epsilon)

          [q1,V0,q1] -> 1                                  P10

A a brief check, consider the string from the derivation
    P2, P4, P9  that produces the string  01

    P2, P4, P8, P9, P10  that produces the string  0011