Sunday, 23 June 2013

Regular, CFG, Recursive languages

Identities of regular expression
1) R*R* = R*
2) (R*)* = R*
3) RR* = R*R
4)(PQ)*P =P(QP)*
5) (a+b)* = (a*b*)* = (a*+b*)* = (a+b*)* = a*(ba*)*
 example simplify ((a*b*)* (b*a*)*)*
   = ((a+b)*(b+a)*)*
   = ((a+b)*(a+b)*)*  //set union is commutative that is a+b = b+a
   = (a+b)*                  //using above 1) and 2)
6) ∅ + R = R
7) Rε = R
8) ∅* = ε

The empty set  ∅ = {} is not the same as the empty string ε. So {}* = {ε} != {}.

Example simplify  ∅* + a* + b* + (a + b)*
since  ∅* ={ε} and ε subset of (a + b)*
a* and b* are also subset of (a + b)*
i.e a* + (a + b)* =(a + b)*
so its (a+b)*


Properties of Regular expressions
1. Associativity
associative over both Union and Concatenation
a+(b+c)=(a+b)+c  and (ab)c =a(bc)

 2)Commutative
over union only
a+b = b+a  but ab != ba

3)Distributive 
Concatenation over union
 both left and right distributive
 a . (b + c) = a.b + a.c  and also (a+b).c = a.c+b.c
Union over concatenation
 not distributive that is b+(a.c) != (b+a ). (b+c)

4)Identity
R+∅ = R  // for union
R.ε =R  // for concatenation

5)Annihilator
For concatenation  ∅ is annihilator
R.∅ = ∅
For union no annihilator

Practice problems
http://www.cs.utexas.edu/~cline/ear/automata/CS341-Fall-2004-Packet/2-Homework/


Finite Automata
  • with output
    • Moore m/c
    • Mealy m/c
  • without output
    • Deterministic for each input there is one and only one state to which automata can transition from current state
    • Non deterministic automata can be in several state at once
    • epsilon NFA

Deterministric Automata
Five tuple A = (Q,Σ,δ,q,F)
1. A finite set of states, often denoted Q
2. A finite set of input symbols, often denoted Σ
3. A transition function that takes as arguments a state and an input symbol and returns a state
The transition function is commonly denoted δ
δ(q,a) =p
4. A start state, one of the states in Q
5. A set of final or accepting states F (F ⊆ Q)

Note In transition table the new state on any input is single unlike NFA where its set of states


Non deterministic Automata
  • can be several states at once
  • each NFA accepts language that is accepted by some DFA 
  • easier than DFA
  • we can convert NFA to DFA
  • Transition function δ(q,a)  returns set of states {p1,p2,p3..}
DFA has almost as many states as NFA but has more transitions
In worst case DFA can have 2^n states when correponding NFA has n states

A NFA is allowed to make transition without getting any inputs called epsilon transition, this does not affects the class of language accepted but make implementation easier

For every DFA  there is equivalent RE
For every RE there is equivalent eNFA

DFA ---------->  RE
State Elimination Method
Eliminates states of the automaton and replaces the edges with regular expressions that includes the behavior of the eliminated states.
Eventually we get down to the situation with just a start and final node, and this is easy to express as a RE

1)If more than one final state make it single final state with all final going to this new

2)If the two states are different, we will have an automaton that looks like the following:

Note
when eliminating the state we take care to loop for all the possible loops that go through that state


Closure properties of regular language
we say language L is closed under operation X if the output of X is in L whenever inputs are in L.
Regular languages are closed under

1)Union and Intersection
cross-product of the two DFA will recognize union and intersection of two languages

2)Set complement
if we make non final states as final and final states are non final states, we get a DFA accepting the complement of language

3)String Reversal
pick NFA recognizing the languag. create new final state with epsilon transitions from old state then swap the start and final states and change directions on each transition we have a NFA accepting the reversed string

4)Set difference
can be rewritten in form of intersection and compliment

6)Concatenation and star

5)Homomorphism (substitution of string for symbols)
It is function from string ---> string  . Its output on multicharacter string is its output on individual characters in string
h(xy) =h(x)h(y) for any string x and y
Since we can always write regular expression for any homomorphism of regular language its closed under homomorphism

6)Inverse Homomorphism


Regular language not closed under
1)Subset and Superset operations


Homomorphism
consider symbols of language L as {a,b} and Homomorphism H symbol set as {a,b,c}
h(a) =ab ,h(b) =bbc then
h(aba)=h(a)h(b)h(c) = abbbcab
The homomorphic image of language L1 ={aa,aba} is h(L1) ={abab,abbbcab}

Decision properties of Regular Language
1)Membership
Is string w in regular language L?
If string w is accepted in the FA of language L then its member

2)Emptiness
Is L = ∅ ?
Remove all unreachable states and edges from the FA for the language L and then check if it has any final(accepting) state. Use graph reachability algorithm to check.

3)Finiteness
Is L finite language?
A regular language is not necessarily finite
DFA method
Remove states that are not reachable from the start state and all states that donot reach an accepting state from the DFA
If there are any cycles in the remaining DFA then it is infinite else finite
Regular Expression Method
If there are any * in the RE then it should be infinite but there are exceptions like
0ε*1 is not infinite nor is 0*∅  so we need to eliminate ∅ and ε equivalent expressions from RE
Remember
E+F is equivalent to ∅ iff both E and F are equivalent to ∅
EF is equivalent to ∅  if either E or F is
ε is not equivalent to ∅ neither is E*

E+F is equivalent to ε iff both E and F are equivalent to ε
EF is equivalent to ε  if both E and F are
E*  is equivalent to ε iff E is 

4)Equivalence
in product automaton LxM
final state pair for which exactly one final state of q and r
the language accepted should be empty for equivalence

5)Containment
Is L subset of M?
in product automata LxM
make those pair as final state for which only L is final and M is not
the language accepted should be empty for containment

make

Non Regular Language
Not recognized by finite automata
Finite automata cannot distinguish infinite string because they have limited memory

Myhill-Nerode's theorem
Indistinguishable strings
String x and y in Σ* are indistinguishable with respect to language L if and only if every string z like xz, yz are both in L or both are not in L
1)choose z
2)append z to strings
3)and check new string  whether they are in language or not
example  Check whether a and aa are indistinguishable with respect to language {an} over alphabet a
1)we choose string z in Σ* to be ak ,
2)check string of form aakand aak, for every k this strings are in language
3)so this strings are indistinguishable

Theorem
Theorem : A language L over alphabet Σ is nonregular if and only if there is an infinite subset of  Σ* , whose strings are pairwise distinguishable with respect to L.

Example Let L1 = { anbn | n is a positive integer } over alphabet { a , b } show its nonregular using Myhill-Nerode
1)Consider the set of strings S1 = { an | n is a positive integer } . S1 is over alphabet { a , b } and it is infinite. We will check for its strings and try to show that they are pairwise distinguishable with respect to L1.
 2)Let x= ak and y= am be two strings of the set S1, where k,m >0 and k != m .
3)Select z= bm as a string to be appended to ak and a .
4)we see that akbm is not into L1 while ambm is in L1 .
Hence ak and am are distinguishable with respect to L1 . Since ak and am are arbitrary strings of S1, S1 satisfies the conditions of Myhill-Nerode theorem. Hence L1 is nonregular.

Pumping Lemma : Suppose that a language L is regular. Then there exist integer n > =1  such that for any string x in L with |x| >=  n(pumping length), there are strings u, v and w which satisfy the following relationships:
            x = uvw
            |uv| <= n     // loop appears within first n letters where n is pumping length
            |v| > 0   and 
            for every integer m >= 0, uvmw element of L.   // v is the repeated string or one that loops

  • Pumming lemma follows from simple pigeonhole rule that if  number of symbols is at least the size of string in the language then atleast one state must be repeated.
  • pumping length n is the number of states in the FA
  • v is the string that is repeated.choosing v to be of at least one letter assures that loop appears within first p letters
  • so we chose a string x of length at least n  and try to prove that there is some repeated state

Pumping lemma gives necessity condition for language to be regular but its not the sufficient condition i.e even if we prove pumming lemma for language L its not necessarily  regular language

Context Free Grammar

Closure Properties
1)Union
Is L1 U L2 a CFL?
closed under union
by adding a new rule for starting symbol like S->S1|S2

2)Concatenation and closure
closed under both
for concatenation add S1->S2
for kleen closure add S->S1|S|e

3)Homomorphism
closed under homomorphism

4)Substitution
Homomorphism is special case of substitution
closed under substitution

5)Intersection
not closed under intersection

6)Intersection with Regular language
L1 CFL and L2 regular then L1 intersection L2 is CFL

7) Complementation

not closed as would mean intersection is closed which is not
¬(¬L1 ∪ ¬L2) = L1 ∩ L2 

8)Set difference
not closed under set difference

8) Set difference with Regular language
L1 CFL L2 Regular then
L1-L2 =L1 intersection L2'
which is CFL so its closed under CFL

9)Inverse Homomorphism
closed under inverse homomorphism

Decision problems of CFG
Decidable
1)Emptiness

2)finiteness

3)membership

Undecidable
1)equivalence L1 and L2 CFG then is L1=L2?
2)Inclusion or containment 
can L1 generate all string generated by L2?
3)is CFG for language ambigous
4)Universality
Does it generate the language of all strings over the alphabet of terminal symbols used in rules
Is L(G) = Σ*?
5)given context sensitive grammer does it describe CFG or
6)given CFG does it describle Regular language



If grammar G is in CNF any derivation of string w has 2n-1 steps for |w|=n



Problem  DFA recognizing L = {x ∈ Σ∗ : 1110 is not a prefix of x}.
Solution

CYK algorithm
//incomplete for now
To check if membership property of CFL

when blank inthe table ?
when S inthe table ?


division for substring of size 3  aab can be
aa  b
a ab
which can be solved by looking at level 2 and level 1 only

for substring of size 4
division can be
a aba
aab a
which can be solved by looking at level 1 and 3
aa ba
which can be solved by looking at level 2

for substring of level 5
division ca be
a abaa
aaba a
 looking at level 1 and 4
aab aa
aa baa
level 2 and 3


Pushdown Automata
like NFA with stack(unlimited memory assumed), stack provide additional memory which help recognize some non regular languages
NPDA equivalent in power to CFG
PDA can write symbols(push) on the stack and read them(pop) back
LIFO
DPDA and NPDA are not equal in power. Expressive power of NDPD more than DPDA
NPDA can recognize languages not recognized by DPDA  unlike DF and NF in regular grammar which accepts same set of languages

Theorem
A language is context free if and only if there is some pushdown automata that recognizes it and also
If PDA accepts any language then its CFL

Theorem
L1 be set of languages accepted by PDA by empty stack L2 be set of languages accepted by PDA by final state then L1=L2

Fig. state transition in PDA


Difference between NPDA and PDA
language like L1= {wwR|w (a+b)*}  and L2 = {aibjck | i=j or j=k}
needs non determinism to recognize them
In Language L1 at each point we need to nondeterministically guess middle of string and in L2 we need two state on transition from a to b from one state we check for c also by popping and on other we check for b by popping
L1 can be converted such that its accepted by DPDA by including some middle marker in it.


Non Context Free Languages
L3 = {ww| we(a+b)*} is not accepted by PDA because the here we need FIFO but stack is LIFO
while complement of L3 is CFL

L4 = ({aibjck | i,j,k= n }
Its not accepted by PDA because we can either match a with a or c. It requires two memory elements
Complement of L4 is also CFL

L5 = {aibjckdm | i,j,k= n }

Check for Non CFG

1) denote each symbol power in form of variable i j k ..
 If language L6= {anbncm | n,m >=1} convert it to {aibjckdm | i=j=n,k= m, i,j,k >=1}
check if more than one comparision required (more than one memory element required) in variables i, j, k..
 for above language we have only one comparision i=j so its CFL
For language L7 = {ambncmdn  m,n>=1} we convert it to {abjckdl   i=k, j=l i,j,k,l >= 1}
we see there are two comparisons so its not a CFL

2)Check if FIFO is required if so not a CFL
example L= {ww}

3)If language is over single alphabet then it behaves similar to Regular grammar, that is, It is CFL only if the length of string(power of letter) is in arithematic progression
example L8={an2 | n>=1} and  L9={an! |n>=1} are not CFL



Recursive and Recursively Enumerable
A language L is recursively enumerable(re) or Turing Acceptable if there exists a TM M such that L=L(M) or some TM accepts the language L.For a string not in language it may halt or may loop forever unlike recursive language
A language L is recursive(r)  or Turing Decidable if there exists a TM M such that L=L(M) or L is accepted by some TM and M halts on every w∈ Σ+ .
For recursive language a membership algorithm  exists.
For string not in language it should halt in non final state
that is L' is also recursive or is decidable

Non recursively Enumerable L'
If a language is not accepted by TM then its non recursively enumerable that is there is no algorithm to describe the language
L' is not recursively enumerable and hence not recursive

Recursive enumerable but not recursive
that is there exist a TM that accepts the language but doesnot halt on some input
or There exists a recursively enumerable language that is not recursive.
Universal language Lu is example.

If languages L and L' are both RE, then L is recursive.
implications
if L is not RE then L cannot be recursive
if L' is not RE then L is not recursive
if L is RE but not recursive then L' is not RE

If L is recursive, then L' is recursive.
If G is an unrestricted grammar, then L(G) is recursively enumerable.

Problem Consider language L1, L2, and L3 are recursively enumerable over the alphabet Σ such that
(a) Li ∩ Lj = Φ when i ≠ j; and
(b) L1 ∪ L2 ∪ L3 = Σ*
Prove that L1 is recursive.
Solution
We know that language is recursive if L and its complement L' are both RE
L1 is RE, we check L1' which is (L2 U L3)
L2 , L3 are recursive enumerable and we know that RE are closed under union, so L1' is also RE
Hence L1 is Recursive.

Problem Prove that L* is recursive if L is recursive
Solution Kleen closure is defined in terms of union and since recursive language is closed under union L* is also recursive

Problem 
1) If A is regular and B is regular, then A ∪ B'
regular because regular languages are closed under union and complementation

2)If A is regular and B is context free, then A ∪ B'
A is also context free because every regular language is context free
complement of B may not be CF because context free languages not closed under complementation
So we cannot say whether regular or context free grammar here, we check for recursive language here
B is context free so is recursive also and recursive languages are closed under complementation so B' is also recursive . Similarly we can prove its Recursive

3)If A is context-free and B is regular, then A∪ B' is
Context Free

4)If A is recursive and B is recursive, then A∪ B' is:
Recursive

5)If A is recursive and B is Recursively enumerable, then A∪ B' is:
can not be deduced

6)If A is recursively enumerable and B is recursive then A∪ B' is:
Recursive enumerable

Problem Prove that if a finite set of recursively enumerable languages partition Σ∗ , then the languages are recursive
Solution
Since language complement is also recursively enumerable and we know that L and L' are recursively enumerable then L must be recursive.

Problem If L1 is recursive language. L2 and L3 are recursively enumebrable but not recursive then
1)L2 -L1
L2 -L1 = L2 intersection L1'
L1 is recursive and we know that recursive languages are closed under complementation so L1' is also recursive and every recursive language is also recursively enumerable so L1' is also recursively enumerable
, recursively enumerable language are closed under intersection so its recursively enumerable
2)L1-L3
L1 intersection L3'
L3 is recursively enumerable but complement of L3 is not necessarily recursively enumerable.
so we cannot deduce anything
3)L2 intersection L1
Recursively enumerable


Closure properties of recursive language
Recursive Languages are closed under
1)complementation
because if language L is accepted by TM M and it halts on every input then
machine M' accepts w if M rejects it and M' rejects if M accepts it.
2)difference
3)intersection.
first run on M1 and them on M2 if accepted but both then Accept else reject
4)union.


Closure properties of recursive enumerable languages
1)Recursively enumerable languages are closed under union.
For any two Turing-recognizable language L1 and L2, let M1 and M2 be the TMs that
recognize them. We construct a TM M that recognizes the union of L1 and L2.
- For the input w
- Run M1 and M2 in parallel.
- If either accept, accept
- If both reject, reject.
2)Closed under Intersection
3)Closed under kleen closure
We can use a multi tape Turing machine as all multi tapes Turing machines have an equivalent one tape Turing machine For any Turing-recognizable language L, Let M be the TM that recognizes it. We construct a TM M1 that recognizes the star of L:
On input w, split w into w1w2…wn, save each on a tape
Run M on wi for i=1,2,…,n.
If M accepts each of these string wi, accept.
If reject reached on the possible parts, reject.”
If there is a way to cut w into substrings such M accepts all the substrings, w belongs
to the star of L and M1 will accept w after a finite number of steps.


Not Closed under
5) set difference
6)Complementation (we only need to show that there is a language that is r.e. but not recursive.)

 If L is r.e. but not recursive then L' is not r.e.


Turing Machine
Unlimited and Unrestricted memory
tape as its unlimited memory, tape head to read and write symbols and move around on the tape
Difference between Finite automata
A TM can both read from and write on the tape
Read write head can move both left and right
infinite tape
special states for accepting and rejecting take effect immediately


Every Non Deterministic TM and an equivalent Deterministic TM
that is has same expressive power
A language is decidable if and only if some NTM decides it

Halting problem
recognizer are more powerful than decider
Acceptance of string by TM is called Halting problem
its undecidable

A language is decidable if both L and its complement L' is Turing recognizable
Compliment of Acceptance problem by TM is not TM recognizable else it would be decidable

Decision problem for Type 0 or recursive enumerable language
Undecidable
1)Given M ,  is L(M) regular, context-free, recursive
2)membership inclusion universality equivalence emptiness 

    

w ∈ L? L = ∅? L = Σ*? L1 ⊆ L2? L1 = L2? L1 ∩ L2 = ∅?
Regular
(type 3)
DDDDDD
Det. Context FreeDDDOUU
Context Free
(type 2)
DDUUUU
Context Sensitive
(type 1)
DUUUUU
RecursiveDUUUUU
Recursively Enumerable
(type 0)
UUUUUU

References
www.cs.uiuc.edu/class/fa05/cs475/Lectures/new/bwlec21.pdf     //closure
web.cs.wpi.edu/~kal/courses/cs503/module10/hw9s07solns.pdf  //good problems
http://www.soe.ucsc.edu/classes/cmps132/Winter01/         //problems on turing and recursive lan 
http://www.cs.odu.edu/~toida/nerzic/390teched/regular/reg-lang/non-regularity.html
http://www.cs.odu.edu/~toida/nerzic/390teched/web_course.html  // complete course
http://www.dhedhi.com/tazeen/CS341_F03/notes/mistakes.htm   // frequent mistakes
www2.imm.dtu.dk/courses/02140/L8-MF.pdf //decision property of regular language
http://www.cs.uiuc.edu/class/fa07/cs273/      //problem set  
www.cs.uwaterloo.ca/~watrous/360/assignments/1-solutions.pdf   //problems on prefix substrings 
http://www.cs.utexas.edu/~cline/ear/automata/CS341-Fall-2004-Packet/2-Homework/   //problems 
http://www.cs.wcupa.edu/~rkline/csc520/    //undecidable problems 

No comments:

Post a Comment