Theory of Automata
------------------
Intuitively a finite automaton is a mathematical model of
computing device that has discrete inputs and outputs as well as a
finite set of internal states. The automaton takes as input a sequence
of symbols drawn from an alphabet of symbols, and after reading each symbol
in sequence, the automaton makes a transition from its current state
to a new state depending on the value of the input read. Mathematically
this behavior is specified by a 5-tuple of data (Q,A,T,I,F). Here
Q is the set of states of the automaton, finite in number. A is the
alphabet from which sequences of input symbols are drawn. T is the
transition function from Q*A to Q that takes a state q in Q and a symbol
a in A and returns a new state r, which is the state the automaton should
make a transition to upon reading the input symbol a in the state q. I is
simply an initial state, and F is a subset of Q called the set of final, or
accepting states.
One can visualize all this data as a directed graph. The nodes
of the graph are the states and are labelled with elements of Q. For a given
state q, arrows lead out from q to other states, with exactly one arrow
for every input symbol in A. These set of arrows represent transitions
T of the automaton. The action of the automaton can be visualized as a
dot starting at the initial state I and following the arrows as determined
by the input.
Actual instances of finite automaton in technology are
ubiquitious. One familiar example may be the control circuitry of an
elevator. The state of the system is given by the current floor the
elevator is on, while the input is given by the button pushed. This
example is unique in the sense that the alphabet A and the set of
states Q are the same, and slightly irregular in the sense that the automaton
has no final state; it runs forever as long as people are there to give it
input. Indeed any kind of switching circuit can be viewed as a finite
state machine.
From a theoretical point of view, an important application of
finite automata is the problem of language recognition. A language L
over an alphabet of symbols A is simply some subset of
A*, where A* is
the set of all possible sequences of symbols drawn from A. A finite
automaton is said to accept a sequence S in
A* if and only if the automaton takes as
input sequence S and ends up in one of the special accepting states in
F right after reading the last input symbol in S. Otherwise the
automaton is said to reject S. A finite automaton is said to accept a
language L in A* if the automaton accepts a
sequence S whenever it is an element of the language L, and rejects a
sequence S whenever it is not an element of L.
There are most definitely many languages that cannot be
recognized by finite automata. Consider the alphabet A consisting of
0 and 1 alone, so that A* is simply the set
of binary numbers of any length, possibly with extra zeroes on the
left. Furthermore consider the language L consisting of those binary
numbers in A* that contain a sequence of
n2 1's for any positive integer n. It can
be proven that no finite automaton can recognize this language.
Intuitively, the idea behind the proof is that any such automaton
would have to count an arbitrary number of 1's, say m of them and
check that m is a perfect square. Not machine with a finite number of
states can do this for arbitrarily large m. Although a finite
automaton is not up to the job, a Turing machine can be designed to
recognize this language. In fact it turns out finite automata in the
grand scheme of computing devices are relatively weak. They form the
lowest rung in a hierarchy that includes, in order of increasing power,
finite automata, pushdown automata, bounded linear automata, and Turing
machines. Pushdown automata are merely finite automata endowed with
access to an unbounded stack, and bounded linear automata are Turing machines
whose read/write head is not allowed to leave the region of tape where
the input is given.
Since there are relatively strong limitations on what a finite
automaton can do, one may well ask is there a way to characterize all
languages that actually can be recognized by finite automata. This
set of languages is known as the set of regular languages and can be
specified by regular expressions. Regular languages themselves are a
very restricted classes of languages, in keeping with the relative
weakness of finite automata. Since pushdown automata are more
powerful than finite automata, they can recognize a more general class
of langages known as context free languages, whereas the linear bounded
automata can recognize an even more general class: the context sensitive
languges. Finally Turing machines, the most powerful machines can
recognize any language that can be recognized by an algorithmic procedure.
Such a language is called recursively enumerable. This hierarchy of machines
and languages is intimately related to the theory of grammars and is called
the Chomsky hieararchy after the lingist Noam Chomsky.
A variant of the finite automaton which at first glance may be
thought to be more powerful is the nondeterministic finite automaton.
The main difference is that the transition rules T may contain for a
given input a and state q, multiple arrows leading to different
states. Thus the next state is not determined by the current state
and the current input, hence the nondeterminism. One can pick any of
the states pointed to by the arrows as the next state. A
nondeterministic automaton is said to accept an input if there exists
any possible choice of transitions that would be consistent with the
input and would lead in the end to an accepting state. Thus
intuitively the nondeterministic automaton is like the deterministic
automaton but has as its disposal the right to make lucky guesses at
each transition point that might lead to an accepting state if such a
path through state space exists. It turns out that any
nondeterministic finite automaton can be simulated by a deterministic
one, so the class of languages accepted by both types of automaton are
the same. The main use of nondeterministic automata is theoretical;
they are simpler to design and are useful in proofs.