# Formal language

*34,192*pages on

this wiki

Assessment |
Biopsychology |
Comparative |
Cognitive |
Developmental |
Language |
Individual differences |
Personality |
Philosophy |
Social |

Methods |
Statistics |
Clinical |
Educational |
Industrial |
Professional items |
World psychology |

**Language:**
Linguistics ·
Semiotics ·
Speech

In mathematics, logic, and computer science, a **formal language** **L** is a set of finite-length sequences of elements drawn from a specified finite set **A** of symbols. Among the more common options that are found in applications, a formal language may be viewed as being analogous to (1) a collection of words or (2) a collection of sentences. In Case 1, the set **A** is called the *alphabet* of **L**, whose elements are called *words*. In Case 2, the set **A** is called the *lexicon* or the *vocabulary* of **L**, whose elements are then called *sentences*. In any case, the mathematical theory that treats formal languages in general is known as *formal language theory*.

Although it is common to hear the term *formal language* used in other contexts to refer to a mode of expression that is more disciplined or more precise than everyday speech, the sense of *formal language* discussed in this article is restricted to its meaning in formal language theory.

An alphabet might be , and a string over that alphabet might be .

A typical language over that alphabet, containing that string, would be the set of all strings which contain the same number of symbols and .

The **empty word** (that is, length-zero string) is allowed and is often denoted by , or . While the alphabet is a finite set and every string has finite length, a language may very well have infinitely many member strings (because the length of words in it may be unbounded).

A question often asked about formal languages is "how difficult is it to decide whether a given word belongs to a particular language?" This is the domain of computability theory and complexity theory.

## Contents

[show]## Examples

Some examples of formal languages:

- the set of all words over
- the set , n is a natural number and means repeated times
- Finite languages, such as -
- the set of syntactically correct programs in a given programming language; or
- the set of inputs upon which a certain Turing machine halts.

## Specification

A formal language can be specified in a great variety of ways, such as:

- Strings produced by some formal grammar (see Chomsky hierarchy);
- Strings produced by a regular expression;
- Strings accepted by some automaton, such as a Turing machine or finite state automaton;
- From a set of related YES/NO questions, those questions for which the answer is YES — see decision problem.

## Operations

Several operations can be used to produce new languages from given ones. Suppose and are languages over some common alphabet.

- The
*concatenation*consists of all strings of the form where is a string from and is a string from . - The
*intersection*of and consists of all strings which are contained in and also in . - The
*union*of and consists of all strings which are contained in or in . - The
*complement*of the language consists of all strings over the alphabet which are not contained in . - The
*right quotient*of by consists of all strings for which there exists a string in such that is in . - The
*Kleene star*consists of all strings which can be written in the form with strings in and . Note that this includes the empty string because is allowed. - The
*reverse*contains the reversed versions of all the strings in . - The
*shuffle*of and consists of all strings which can be written in the form where and are strings such that the concatenation is in and are strings such that is in .

## See also

- Language for languages in general
- Syntax for the form of a language in general
- Semantics for the meanings in a language
- Natural language for languages that are not formal
- Computer language for application of formal languages in computing
- Programming language for the application of formal languages to program computers

## Further reading

- Hopcroft, J. & Ullman, J (1979).
*Introduction to Automata Theory, Languages, and Computation.*Addison Wesley. ISBN 0-201-02988-X

- G. Rozenberg, A. Salomaa eds. Handbook of Formal Languages. Springer-Verlag. 3 vol.

- http://icalp06.dsi.unive.it/ ICALP 2006 33rd International Colloquium on

Automata, Languages and Programming.

- http://www.cs.auckland.ac.nz/CDMTCS/conferences/dlt/DLTConfSeries.html International Conferences on Developments in Language Theory

Automata theory: formal languages and formal grammars
| |||
---|---|---|---|

Chomsky
hierarchy | Grammars
| Languages | Minimal
automaton |

Type-0 | Unrestricted | Recursively enumerable | Turing machine |

n/a | (no common name) | Recursive | Decider |

Type-1 | Context-sensitive | Context-sensitive | Linear-bounded |

Type-2 | Context-free | Context-free | Pushdown |

Type-3 | Regular | Regular | Finite |

Each category of languages or grammars is a proper superset of the category directly beneath it. |

bg:Формален език cs:Formální jazyk de:Formale Sprache el:Τυπική γλώσσα es:Lenguaje formal fr:Langage formel ko:형식 언어he:שפה פורמלית hu:Formális nyelv nl:Formele taalpt:Linguagem formal ro:Limbaje formale ru:Формальный язык sk:Formálny jazyk fi:Formaali kielizh:形式语言

This page uses Creative Commons Licensed content from Wikipedia (view authors). |