Psychology Wiki

Formal language

34,189pages on
this wiki
Revision as of 12:56, September 29, 2006 by Lifeartist (Talk | contribs)

(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)

Assessment | Biopsychology | Comparative | Cognitive | Developmental | Language | Individual differences | Personality | Philosophy | Social |
Methods | Statistics | Clinical | Educational | Industrial | Professional items | World psychology |

Language: Linguistics · Semiotics · Speech

In mathematics, logic, and computer science, a formal language L is a set of finite-length sequences of elements drawn from a specified finite set A of symbols. Among the more common options that are found in applications, a formal language may be viewed as being analogous to (1) a collection of words or (2) a collection of sentences. In Case 1, the set A is called the alphabet of L, whose elements are called words. In Case 2, the set A is called the lexicon or the vocabulary of L, whose elements are then called sentences. In any case, the mathematical theory that treats formal languages in general is known as formal language theory.

Although it is common to hear the term formal language used in other contexts to refer to a mode of expression that is more disciplined or more precise than everyday speech, the sense of formal language discussed in this article is restricted to its meaning in formal language theory.

An alphabet might be \left \{ a , b \right \}, and a string over that alphabet might be ababba.

A typical language over that alphabet, containing that string, would be the set of all strings which contain the same number of symbols a and b.

The empty word (that is, length-zero string) is allowed and is often denoted by e, \epsilon or \Lambda. While the alphabet is a finite set and every string has finite length, a language may very well have infinitely many member strings (because the length of words in it may be unbounded).

A question often asked about formal languages is "how difficult is it to decide whether a given word belongs to a particular language?" This is the domain of computability theory and complexity theory.


Some examples of formal languages:

  • the set of all words over {a, b}
  • the set \left \{ a^{n}\right\}, n is a natural number and a^{n} means a repeated n times
  • Finite languages, such as {a, aa, bba} -
  • the set of syntactically correct programs in a given programming language; or
  • the set of inputs upon which a certain Turing machine halts.


A formal language can be specified in a great variety of ways, such as:


Several operations can be used to produce new languages from given ones. Suppose L_{1} and L_{2} are languages over some common alphabet.

  • The concatenation L_{1}L_{2} consists of all strings of the form vw where v is a string from L_{1} and w is a string from L_{2}.
  • The intersection L_1 \cap L_2 of L_{1} and L_{2} consists of all strings which are contained in L_1 and also in L_{2}.
  • The union L_1 \cup L_2 of L_{1} and L_{2} consists of all strings which are contained in L_{1} or in L_{2}.
  • The complement of the language L_{1} consists of all strings over the alphabet which are not contained in L_{1}.
  • The right quotient L_{1}/L_{2} of L_{1} by L_{2} consists of all strings v for which there exists a string w in L_{2} such that vw is in L_{1}.
  • The Kleene star L_{1}^{*} consists of all strings which can be written in the form w_{1}w_{2}...w_{n} with strings w_{i} in L_{1} and n \ge 0. Note that this includes the empty string \epsilon because n = 0 is allowed.
  • The reverse L_{1}^{R} contains the reversed versions of all the strings in L_{1}.
  • The shuffle of L_{1} and L_{2} consists of all strings which can be written in the form v_{1}w_{1}v_{2}w_{2}...v_{n}w_{n} where n \ge 1 and v_{1},...,v_{n} are strings such that the concatenation v_{1}...v_{n} is in L_{1} and w_{1},...,w_{n} are strings such that w_{1}...w_{n} is in L_{2}.

See also

Further reading

  • Hopcroft, J. & Ullman, J (1979). Introduction to Automata Theory, Languages, and Computation. Addison Wesley. ISBN 0-201-02988-X
  • G. Rozenberg, A. Salomaa eds. Handbook of Formal Languages. Springer-Verlag. 3 vol.

Automata, Languages and Programming.

Automata theory: formal languages and formal grammars
Grammars Languages Minimal
Type-0 Unrestricted Recursively enumerable Turing machine
n/a (no common name) Recursive Decider
Type-1 Context-sensitive Context-sensitive Linear-bounded
Type-2 Context-free Context-free Pushdown
Type-3 Regular Regular Finite
Each category of languages or grammars is a proper superset of the category directly beneath it.
ar:لغة شكلية

bg:Формален език cs:Formální jazyk de:Formale Sprache el:Τυπική γλώσσα es:Lenguaje formal fr:Langage formel ko:형식 언어he:שפה פורמלית hu:Formális nyelv nl:Formele taalpt:Linguagem formal ro:Limbaje formale ru:Формальный язык sk:Formálny jazyk fi:Formaali kielizh:形式语言

This page uses Creative Commons Licensed content from Wikipedia (view authors).

Around Wikia's network

Random Wiki