Theory of Automata ------------------ Intuitively a finite automaton is a mathematical model of computing device that has discrete inputs and outputs as well as a finite set of internal states. The automaton takes as input a sequence of symbols drawn from an alphabet of symbols, and after reading each symbol in sequence, the automaton makes a transition from its current state to a new state depending on the value of the input read. Mathematically this behavior is specified by a 5-tuple of data (Q,A,T,I,F). Here Q is the set of states of the automaton, finite in number. A is the alphabet from which sequences of input symbols are drawn. T is the transition function from Q*A to Q that takes a state q in Q and a symbol a in A and returns a new state r, which is the state the automaton should make a transition to upon reading the input symbol a in the state q. I is simply an initial state, and F is a subset of Q called the set of final, or accepting states. One can visualize all this data as a directed graph. The nodes of the graph are the states and are labelled with elements of Q. For a given state q, arrows lead out from q to other states, with exactly one arrow for every input symbol in A. These set of arrows represent transitions T of the automaton. The action of the automaton can be visualized as a dot starting at the initial state I and following the arrows as determined by the input. Actual instances of finite automaton in technology are ubiquitious. One familiar example may be the control circuitry of an elevator. The state of the system is given by the current floor the elevator is on, while the input is given by the button pushed. This example is unique in the sense that the alphabet A and the set of states Q are the same, and slightly irregular in the sense that the automaton has no final state; it runs forever as long as people are there to give it input. Indeed any kind of switching circuit can be viewed as a finite state machine. From a theoretical point of view, an important application of finite automata is the problem of language recognition. A language L over an alphabet of symbols A is simply some subset of A*, where A* is the set of all possible sequences of symbols drawn from A. A finite automaton is said to accept a sequence S in A* if and only if the automaton takes as input sequence S and ends up in one of the special accepting states in F right after reading the last input symbol in S. Otherwise the automaton is said to reject S. A finite automaton is said to accept a language L in A* if the automaton accepts a sequence S whenever it is an element of the language L, and rejects a sequence S whenever it is not an element of L. There are most definitely many languages that cannot be recognized by finite automata. Consider the alphabet A consisting of 0 and 1 alone, so that A* is simply the set of binary numbers of any length, possibly with extra zeroes on the left. Furthermore consider the language L consisting of those binary numbers in A* that contain a sequence of n2 1's for any positive integer n. It can be proven that no finite automaton can recognize this language. Intuitively, the idea behind the proof is that any such automaton would have to count an arbitrary number of 1's, say m of them and check that m is a perfect square. Not machine with a finite number of states can do this for arbitrarily large m. Although a finite automaton is not up to the job, a Turing machine can be designed to recognize this language. In fact it turns out finite automata in the grand scheme of computing devices are relatively weak. They form the lowest rung in a hierarchy that includes, in order of increasing power, finite automata, pushdown automata, bounded linear automata, and Turing machines. Pushdown automata are merely finite automata endowed with access to an unbounded stack, and bounded linear automata are Turing machines whose read/write head is not allowed to leave the region of tape where the input is given. Since there are relatively strong limitations on what a finite automaton can do, one may well ask is there a way to characterize all languages that actually can be recognized by finite automata. This set of languages is known as the set of regular languages and can be specified by regular expressions. Regular languages themselves are a very restricted classes of languages, in keeping with the relative weakness of finite automata. Since pushdown automata are more powerful than finite automata, they can recognize a more general class of langages known as context free languages, whereas the linear bounded automata can recognize an even more general class: the context sensitive languges. Finally Turing machines, the most powerful machines can recognize any language that can be recognized by an algorithmic procedure. Such a language is called recursively enumerable. This hierarchy of machines and languages is intimately related to the theory of grammars and is called the Chomsky hieararchy after the lingist Noam Chomsky. A variant of the finite automaton which at first glance may be thought to be more powerful is the nondeterministic finite automaton. The main difference is that the transition rules T may contain for a given input a and state q, multiple arrows leading to different states. Thus the next state is not determined by the current state and the current input, hence the nondeterminism. One can pick any of the states pointed to by the arrows as the next state. A nondeterministic automaton is said to accept an input if there exists any possible choice of transitions that would be consistent with the input and would lead in the end to an accepting state. Thus intuitively the nondeterministic automaton is like the deterministic automaton but has as its disposal the right to make lucky guesses at each transition point that might lead to an accepting state if such a path through state space exists. It turns out that any nondeterministic finite automaton can be simulated by a deterministic one, so the class of languages accepted by both types of automaton are the same. The main use of nondeterministic automata is theoretical; they are simpler to design and are useful in proofs.