1.4. Languages: Basic Concepts
Basic data structure or input to grammars or
automaton are strings. Strings are defined over an alphabet which is
finite. Alphabet may vary depending upon the application. Elements of an
alphabet are called symbols. Usually we denote the basic alphabet set
either as Σ or T. For example, the following are a few examples of an alphabet set.
Σ1 = {a,b}
Σ2 = {0,1,2}
Σ3 = {0,1,2,3,4,5,6,7,8,9}. |
A string or word is a finite sequence of symbols from
that alphabet, usually written as concatenated symbols and not
separated by gaps or commas. For example, if Σ = {a, b}, a string abbab is a string or word over Σ. If w is a string over an alphabet Σ, then the length of w written as len(w) or |w| is the number of symbols it contains. If |w| = 0, then w is called as empty string denoted either as λ or ε.
For any word w, w ε = εw = w. For any string w = a1 ... an of length n, the reverse of w is written as wR which is the string anan−1 ... a1, where each symbol ai belongs to the basic alphabet Σ. A string z that is appearing consecutively within another string w is called a substring or subword of w. For example, aab is a substring of baabb.
The set of all strings over an alphabet Σ is denoted
by Σ* which includes the empty string ε. For example, for Σ = {0, 1}, Σ*
= {ε, 0, 1, 00, 01, 10, 11, ...}. Note that Σ* is a countably infinite set. Also, Σn denotes the set of all strings over Σ whose length is n. Hence, Σ* = Σ0 ∪ Σ1 ∪ Σ2 ∪ Σ3 ... and Σ+ = Σ1 ∪ Σ2 ∪ Σ3 .... Subsets of Σ* are called languages.
For example, if Σ = {a, b}, the following are languages over Σ
L1 = {ε, a, b}
L2 = {ab, aabb, aaabbb, ...}
L3 = {w ε Σ*| number of a’s and number of b’s in w are equal.} |
In the above example, L1 is finite, and L2 and L3 are infinite languages. φ denotes an empty language.
We have the following inductive definition of Σ+ and Σ*, where Σ is any basic alphabet set.
Definition 1.4
Let Σ be any alphabet set. Σ+ is a set of non-empty strings over Σ defined as follows:
Clearly, the set Σ+ contains all strings of length n, n ≥ 1.
|
Example 1.11.
Let Σ = {0,1,2}. Then
Suppose we wish to include ‘ε’ in Σ+, we modify the above definition as given below.
|
Definition 1.5
Let Σ be any alphabet set. Σ* is defined as follows:
Since languages are sets, one can define the
set-theoretic operations of union, intersection, difference, and
complement in the usual fashion.
The following operations are also defined for languages. If x = a1 ... an, y = b1 ... bm, the concatenation of x and y is defined as xy = a1 ... anb1 ... bm. The catenation (or concatenation) of two languages L1 and L2 is defined by, L1L2 = {w1w2|w1 ∊ L1 and w2 ∊ L2}.
Note that concatenation of languages is associative because concatenation of strings is associative. Also L0 = {ε} and Lφ = φL = φ, Lε = εL = L.
The concatenation closure (Kleene closure) of a language L, in symbols L* is defined to be the union of all powers of L:
The right quotient and right derivative are the following sets, respectively.
Similarly, left quotient of a language L1 by a language L2 is defined by
The left derivative of a language L with respect to a word y is denoted as L which is equal to {z|yz ∊ L}.
The mirror image (or reversal) of a language is the collection of the mirror images of its words and mir (L) = {mir (w)|w ∊ L} or LR = {wR|w ∊ L}.
The operations, substitutions and homomorphisms are defined as follows.
For each symbol a of an alphabet Σ, let σ(a) be a language over Σa. Also, σ(ε) = ε, σ(αβ)= σ(α).σ(β) for α, β ∊ Σ+. σ is a mapping from Σ* to 2v* or use (v*) where V is the union of the alphabets Σa, is called a substitution. For a language L over Σ, we define:
A substitution is ε-free if and only if none of the language σ(a) contains ε. A family of languages is closed under substitution if and only if whenever L is in the family and σ is a substitution such that σ(a) is in the family, then σ(L) is also in the family.
A substitution σ such that σ(a) consists of a single word wa is called a homomorphism. It is called ε-free homomorphism if none of σ(a) is ε.
Algebraically, one can see that Σ* is a free semigroup with ε
as its identity. The homomorphism which is defined above agrees with
the customary definition of homomorphism of one semigroup into another.
Inverse homomorphism can be defined as follows:
It should be noted that h(h-1(L)) need not be equal to L. Generally h(h−1(L))⊆ L and h−1(h(L)) ⊇ L.
|
Asymptotic Behavior of Functions
The
time taken to execute any algorithm depends upon the machine on which
it is implemented and also on the algorithm. Hence, efficiency of any
algorithm is measured by the amount of time it takes and also the space
it needs for execution on the machine. Comparison of algorithms has
become an important topic. We give here a mathematical basis for
comparing algorithms.
A complexity function f is a function of n, the size of the problem or parameter on which the problem is dependent. That is, f(n) is either a measure of the time required to execute an algorithm on a problem of size n or the measure of memory space. If f(n) is describing the measure of time, then f(n) is called the time-complexity function; if f(n) is describing the measure of space, then it is called space-complexity function.
We have the following important definitions of complexity functions.
Definition 1.6
|
Example 1.12.
|
Definition 1.7
|
Example 1.13.
If f(n) = an3 + bn2 + cn + d, where a, b, c, d are constants and a >0. Then f(n) ∊ θ(n3).
|
Definition 1.8
|
Example 1.14.
f(n) = an2 + bn + c, a, b, c, are constants, a > 0, belongs to Ω(n2).
No comments:
Post a Comment