Thursday, May 31, 2007

Examples of Regular Expressions and Regular Languages.

examples of Regular expressions and Regular Languages.
Ex.:
Let r1 and r2 be arbitrary regular expressions over some alphabet. Find a simple (the shortest and with the smallest nesting of * and +) regular expression which is equal to each of the following regular expressions.

(a) (r1 + r2 + r1r2 + r2r1)*
(b) (r1(r1 + r2)*)+

Solution:
One general strategy to approach this type of question is to try to see whether or not they are equal to simple regular expressions that are familiar to us such as a, a*, a+, (a + b)*, (a + b)+ etc.

(a) Since (r1 + r2)* represents all strings consisting of strings of r1 and/or r2 , r1r2 + r2r1 in the given regular expression is redundant, that is, they do not produce any strings that are not represented by (r1 + r2)*. Thus (r1 + r2 + r1r2 + r2r1)* is reduced to (r1 + r2)*.

(b) (r1(r1 + r2)*)+ means that all the strings represented by it must consist of one or more strings of (r1(r1 + r2)*). However, the strings of (r1(r1 + r2)*) start with a string of r1 followed by any number of strings taken arbitrarily from r1 and/or r2.
Thus anything that comes after the first r1 in (r1(r1 + r2)*)+ is represented by (r1 + r2)*. Hence (r1(r1 + r2)*) also represents the strings of (r1(r1 + r2)*)+, and conversely (r1(r1 + r2)*)+ represents the strings represented by (r1(r1 + r2)*). Hence (r1(r1 + r2)*)+ is reduced to (r1(r1 + r2)*).

In our next Posts, we will see some more good examples of Regular expressions and Regular Languages.

Wednesday, May 30, 2007

Examples of regular expression and regular languages

So far, we have discussed several important formal definitions. Now we will see some good examples of them. We will discuss other important definitions time to time.

Examples of regular expression and regular languages corresponding to them

•( a + b )2 corresponds to the language {aa, ab, ba, bb}, that is the set of strings of length 2 over the alphabet {a, b}.

In general ( a + b )k corresponds to the set of strings of length k over the alphabet {a, b}. ( a + b )* corresponds to the set of all strings over the alphabet {a, b}.

•a*b* corresponds to the set of strings consisting of zero or more a's followed by zero or more b's.

•a*b+a* corresponds to the set of strings consisting of zero or more a's followed by one or more b's followed by zero or more a's.

•( ab )+ corresponds to the language {ab, abab, ababab, ... }, that is, the set of strings of repeated ab's.


Note:
A regular expression is not unique for a language. That is, a regular language, in general, corresponds to more than one regular expression. For example (a + b)* and ( a*b* )* correspond to the set of all strings over the alphabet {a, b}.


Definition of Equality of Regular Expressions

Regular expressions are equal if and only if they correspond to the same language.

Thus for example (a + b)* = (a*b*)*, because they both represent the language of all strings over the alphabet {a, b}.

In general, it is not easy to see by inspection whether or not two regular expressions are equal.

Thursday, May 24, 2007

Deterministic

M is deterministic if it satisfies both the following conditions:
• For any q belongs to Q, a belongs to Σ U {Λ} , X belongs to Γ, the set δ(q,a,X) has at most one element.
• For any q belongs to Q, X belongs to Γ, if δ(q, Λ, X) ≠ Ø , then δ(q,a,X) = Ø for every a belongs to Σ

DPDA & NPDA are different only because DPADA is deterministic and it is weaker than NPDA.

Wednesday, May 23, 2007

Deterministic Push down Automata (DPDA)

“Deterministic Push down Automata” (DPDA):

Formal definition:
A PDA M can be defined as a 7-tuple:
M = (Q,Σ,Γ,q0,Z0,A,δ) where
 Q is a finite set of states
 Σ is a finite set of the input alphabet
 Γ is a finite set of the stack alphabet
 q0 is the start state, an element of Q
 Z0 is the initial stack symbol, an element of Γ
 A is the set of final states, a subset of Q
 δ is a finite transition relation (Q x (Σ U {Λ} x Γ ) ----> the set of finite subsets of (Q x Γ* )

Tuesday, May 22, 2007

Acceptance Criteria for PDA

There are two possible acceptance criteria for PDA:

1. acceptance by empty stack
2. acceptance by final state.

The two are easily shown to be equivalent: a final state can perform a pop loop to get to an empty stack, and a machine can detect an empty stack and enter a final state by detecting a unique symbol pushed by the initial state.

Some use a 6-tuple, dropping the Ω for the initial stack symbol, instead adding a first transition which writes a start symbol to the stack.

Tomorrow, we will see formal definition of “Deterministic Push down Automata” (DPDA).

Monday, May 21, 2007

formal definition of Push down Automata.

In this post we will concentrate on formal definition of Push down Automata. It is also called “Non deterministic Push down Automata” (NPDA). But remember that NPDA is different from “Deterministic Push down Automata” (DPDA)
“Non deterministic Push down Automata” (NPDA)
Formal definition:
A NPDA W can be defined as a 7-tuple:
W = (Q,Σ,Φ,σ,s,Ω,F) where
• Q is a finite set of states
• Σ is a finite set of the input alphabet
• Φ is a finite set of the stack alphabet
• σ (or sometimes δ) is a finite transition relation
(Q x (Σ U {ε} x Ø) ----> P (Q x Ø)
• s is an element of Q the start state
• Ω is the initial stack symbol
• F is subset of Q, consisting of the final states.

Sunday, May 20, 2007

Context-Free-Language (CFL)

A language L is said to be a Context-Free-Language (CFL) if its grammar is Context-Free. More precisely, it is a language whose words, sentences and phrases are made of symbols and words from a Context-Free-Grammar. Usually, CFL is of the form L=L(G). Given below are examples for CFG but not for CFL.
Here I am giving you one example of context-free grammar
Example:
A context-free grammar for the language consisting of all strings over {a,b} which contain a different number of a's to b's is
S → U | V
U → TaU | TaT
V → TbV | TbT
T → aTbT | bTaT | ε
Here, T can generate all strings with the same number of a's as b's, U generates all strings with more a's than b's and V generates all strings with fewer a's than b's.

Saturday, May 19, 2007

Context-Free Grammar (formal definition)

Definition of Context-Free Grammar includes the definition of Context free language. So please read the following definition carefully.

Just as any formal grammar, a context-free grammar G can be defined as a 4-tuple:
G = (Vt,Vn,P,S) where
 Vt is a finite set of terminals
 Vn is a finite set of non-terminals
 P is a finite set of production rules
 S is an element of Vn, the distinguished starting non-terminal.
 elements of P are of the form

Vn(Vt U Vn)*

Friday, May 18, 2007

left regular grammar

In a left regular grammar, all rules obey the forms
1. A → a - where A is a non-terminal in N and a is a terminal in Σ
2. A → Ba - where A and B are in N and a is in Σ
3. A → ε - where A is in N and ε is the empty string.
An example of a right regular grammar G with N = {S, A}, Σ = {a, b, c}, P consists of the following rules
S → aS
S → bA
A → ε
A → cA
and S is the start symbol. This grammar describes the same language as the regular expression a*bc*.
A regular grammar is a left regular or right regular grammar.

Thursday, May 17, 2007

Formal Definition of Regular Grammar:

Regular Grammar and Regular Expressions are different.

We have discussed Regular expressions. Let’s see Regular Grammar.

Formal Definition of Regular Grammar:

In Computer Science, a right regular grammar is a formal grammar (N, Σ, P, S) such that all the production rules in P are of one of the following forms:
1. A → a - where A is a non-terminal in N and a is a terminal in Σ
2. A → aB - where A and B are in N and a is in Σ
3. A → ε - where A is in N and ε denotes the empty string, i.e. the string of length 0.

Wednesday, May 16, 2007

Formal Definition of Regular expression:

Regular expressions can be expressed in terms of formal Language Theory, Regular expressions consist of constants and operators that denote sets of strings and operations over these sets, respectively. Given a finite alphabet Σ the following constants are defined:

(empty set) ∅ denoting the set ∅
(empty string) ε denoting the set {ε}
(Literal character ) a in Σ denoting the set {a}
and the following operations:
(concatenation) RS denoting the set { αβ | α in R and β in S }. For example {"ab", "c"}{"d", "ef"} = {"abd", "abef", "cd", "cef"}.
(alternation) R|S denoting the set union of R and S.
(Kleene Star )R* denoting the smallest superset of R that contains ε and is closed under string concatenation. This is the set of all strings that can be made by concatenating zero or more strings in R. For example, {"ab", "c"}* = {ε, "ab", "c", "abab", "abc", "cab", "cc", "ababab", ... }.
The above constants and operators form a Kleene Algebra.

Many textbooks use the symbols ∪, +, or ∨ for alternation instead of the vertical bar.To avoid brackets it is assumed that the Kleene star has the highest priority, then concatenation and then set union. If there is no ambiguity then brackets may be omitted. For example, (ab)c is written as abc and a|(b(c*)) can be written as a|bc*.
After this we will discuss formal definition of Regular Grammar.

Tuesday, May 15, 2007

Non Deterministic Finite Automata (NFA)-continued

For every NFA a deterministic finite state machine (DFA) can be found that accepts the same language. Therefore it is possible to convert an existing NFA into a DFA for the purpose of implementing a (perhaps) simpler machine. This can be performed using the power set construction which may lead to an exponential rise in the number of necessary states.

NFA is very important for depth knowledge of Automata. You will see that almost everything in Automata Theory is related to NFA.
See you later..

Monday, May 14, 2007

Formal Definition of Non-Deterministic Finite State Machine (NFA)

A nondeterministic finite state automaton (NFA) is a 5-tuple, (S, Σ, T, s0, A), consisting of
 a finite set of states (S)
 a finite set of input symbols (Σ)
 a transition function (T : S × (Σ ∪{ε}) → P(S)).
 an initial (or start) state s0 such that s0 ∈ S
 a set of states A distinguished as accepting (or final) states (A ⊆ S)
where P(S) is the power set of S, ε is the empty string, and Σ is the input symbol alphabet.

Given an NFA M.
Given a string w.
There is any number of computation paths of M with input w.
M accepts w if some computation path ends in a final state.

Sunday, May 13, 2007

Deterministic Finite Automata (DFA)

DFAs are equivalent in computing power to NFAs (nondeterministic finite automata).On the other hand, DFAs are of strictly limited power in the languages they can recognize — many simple languages, including any problem that requires more than constant space to solve, cannot be recognized by a DFA.
The classical example of a simply described language that no DFA can recognize is the language consisting of strings of the form anbn — some finite number of a's, followed by an equal number of b's. It can be shown that no DFA can have enough states to recognize such a language.
See you later with our next post….

Formal Definition of Deterministic Finite State Machine (DFA):

Formal Definition of Deterministic Finite State Machine (DFA):

A DFA is a 5-tuple, (S, Σ, T, s, A), consisting of

 a finite set of states (S)
 a finite set called the alphabet (Σ)
 a transition function (T : S × Σ → S)
 a start state (s ∈ S)
 a set of accept states (A ⊆ S)

Given an DFA M.
Given a string w.
There is at most one computation path of M with input w.
M accepts w if that computation path ends in a final state.

Saturday, May 12, 2007

regular language-formal defnition continued

A regular language is a formal language (i.e., a possibly infinite set of finite sequences of symbols from a finite alphabet) that satisfies the following equivalent properties:
 it can be accepted by a deterministic finite state machine
 it can be accepted by a nondeterministic finite state machine
 it can be accepted by an alternating finite automaton
 it can be described by a regular expression
 it can be generated by a regular grammar
 it can be accepted by a read-only Turing machine
In next post, I will discuss formal definition of deterministic finite state machine (DFA). DFA is my Favorite topic.
I am sure you will also like it when we will discuss it in detail after completing all formal definitions.

Friday, May 11, 2007

Regular Language-formal definition

As I said in my previous post, we will discuss formal definitions of a single term at a time. Today, we are going to discuss “Regular languages”

Regular languages over an alphabet (Formal Definition):
The collection of regular languages over an alphabet Σ is defined recursively as follows:
 the empty language Ø is a regular language.
 the empty string language { ε } is a regular language.
 For each a ∈ Σ, the singleton language { a } is a regular language.
 If A and B are regular languages, then A ∪ B (union), A B (concatenation), and A* (Kleene star) are regular languages

Thursday, May 10, 2007

NLIN SPACE AND LIN SPACE

As I said in this post I am going to continue context sensitive languages and grammars.
Computationally the context-sensitive languages are equivalent with linear bounded non-deterministic Turing machine. That is a non-deterministic Turing machine with a tape of only kn cells, where n is the size of the input and k is a constant associated with the machine.
This means that every formal language that can be decided by such a machine is a context-sensitive language, and every context-sensitive language can be decided by such a machine.
This set of languages is also known as NLIN-SPACE, because they can be accepted using linear space on a non-deterministic Turing machine.
The class LIN-SPACE is defined the same, except using a deterministic Turing machine. Clearly LIN-SPACE is a subset of NLIN-SPACE, but it is not known whether LIN-SPACE=NLIN-SPACE. It is widely suspected they are not equal.

Wednesday, May 9, 2007

context sensitive language continued

Note that there must be the same number of symbols on both the left and right sides of the rule. Rules that are applied to symbols must consider the "context" of the symbol.

For example, in the previous rule, a "B" can be turned into an "X" if (but not necessarily only if) it has a preceding "A" and a following "C". This makes it difficult to modify assumptions since individual symbols cannot be interpreted on their own. Also, the number of rules needed to fully specify the grammar might become very large, since there could be many special cases that need to be considered.

We will continue this topic in the next post..

Monday, May 7, 2007

context sensitive languages

I forget to discuss about context sensitive grammars and context sensitive languages. These grammars are not as important as context free grammars and context free languages.

Let’s discuss them….

Context-Sensitive Language:

Context-Sensitive Language is a formal language that can be defined by a context-sensitive grammar. That is one of the four types of grammars in the Chomsky hierarchy. Of the four, this is the least often used, in both theory and practice.

Also known as "type-1" grammars, this class is restricted to Turing machines that can only examine states immediately before and after the current state. So, the grammar might include a rule to turn a sequence A B C into A X C (A B C --> A X B).

Sunday, May 6, 2007

Pushdown Automata

Pushdown automata can also manipulate the stack, as part of performing a transition. Normal finite state machines choose a new state, the result of following the transition. The manipulation can be to push a particular symbol to the top of the stack, or to pop off the top of the stack. The automaton can alternatively ignore the stack, and leave it as it is. The choice of manipulation (or no manipulation) is determined by the transition table.

For every context-free grammar, there exists a pushdown automaton such that the language generated by the grammar is identical with the language generated by the automaton. Conversely, for every pushdown automaton there exists a context-free grammar such that the language generated by the automaton is identical with the language generated by the grammar.

Now, we know all important basic definitions. So, what is the next step?

Well, Next step is to concentrate on the formal definitions of all previously discussed terms. In my next posts, we will focus on one term at a time.

So be prepared for the next post….

Saturday, May 5, 2007

Automata Step by Step

In the previous post we discussed about “context-free languages” and “context-free grammar”. But without understanding the “pushdown automaton”, you can’t get depth knowledge of them.

Let’s see definition of “pushdown automaton”

Pushdown automaton:

A pushdown automaton (PDA) is a finite automaton that can make use of a stack containing data.

Pushdown automata choose a transition by indexing a table by input signal, current state, and the top of the stack. Normal finite state machines just look by input signal and current state: they have no stack to work with. Pushdown automata add the stack as a parameter for choice. Given an input signal, current state, and a given symbol at the top of the stack, a transition path is chosen.

Thursday, May 3, 2007

context free language and context free grammar

Today we will discuss “context-free languages” and “context-free grammar”
Context-free language:
A context-free language is a formal language that is a member of the set of languages defined by context-free grammars. The set of context-free languages is identical to the set of languages accepted by pushdown automata.
To understand it better we will need the definition of context-free grammars.
Context-free grammar:
A context-free grammar (CFG) is a formal grammar in which every production rule is of the form
V → w
Where V is a nonterminal symbol and w is a string consisting of terminals and/or non-terminals. The term "context-free" expresses the fact that the non-terminal V can always be replaced by w, regardless of the context in which it occurs. A formal language is context-free if there is a context-free grammar that generates it.
Familiarly known as "CFGs", or "type-2" grammars. Here, there can only be one nonterminal symbol on the left-hand "assumption" side of the rule. Luckily, any rule can be rewritten to move all symbols to the right-hand "conclusion" side as needed (for example, A B ® C D is the same as A ® B C D. Rules can be applied to individual symbols since there is only one given assumption (like the "A" in the previous example). This means "CFGs are popular for natural language grammars."
Context-free grammars are powerful enough to describe the syntax of most programming languages; in fact, the syntax of most programming languages is specified using context-free grammars. On the other hand, context-free grammars are simple enough to allow the construction of efficient parsing algorithms which, for a given string, determine whether and how it can be generated from the grammar.

Next post will be based on pushdown automata. So keep in touch….

Wednesday, May 2, 2007

Automata step by step

Deterministic Finite Automaton (DFA):
A deterministic finite state machine or deterministic finite automaton (DFA) is a finite state machine where for each pair of state and input symbol there is one and only one transition to a next state. DFAs recognize the set of regular languages and no other languages.
A DFA will take in a string of input symbols. For each input symbol it will then transition to a state given by following a transition function. When the last input symbol has been received it will either accept or reject the string depending on whether the DFA is in an accepting state or a non-accepting state.
Non-Deterministic Finite Automaton (NFA):
A nondeterministic finite state machine or nondeterministic finite automaton (NFA) is a finite state machine where for each pair of state and input symbol there may be several possible next states. Non-deterministic finite state machines are sometimes studied by the name sub shifts of finite type.
So far, we have discussed a lot of definitions, but still we need to know some more definitions. Don’t be impatience, once you become familiar with all the definitions and terms then it will be very interesting. Don’t forget to read my next Post.

Tuesday, May 1, 2007

Automata Step by Step

Now we will discuss regular languages in this post.

Regular language:
A regular language is the set of strings generated by a regular grammar. Regular grammars are also known as Type-3 grammars in the Chomsky hierarchy.
A regular grammar can be represented by a deterministic or non-deterministic finite automaton. Such automata can serve to either generate or accept sentences in a particular regular language. Note that since the set of regular languages is a subset of context-free languages, any deterministic or non-deterministic finite automaton can be simulated by a pushdown automaton.
Now you will be curious to know about the terms “deterministic finite automaton”, “non-deterministic finite automaton”, “pushdown automaton” and “context-free languages”.
In next Post, we will discuss only “deterministic finite automaton” and “nondeterministic finite state machine”.