In computational linguistics, a frequency list is a sorted list of words (word types) together with their frequencies, where frequency usually means the number of occurrences in a given corpus. A short example could be:

the 3789654
he 2098762
king 57897
boy 56975
outragious 76
stringyfy 5
transducionalify 1
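
A list of this kind can be produced by counting word types in a corpus and sorting by count. The following is a minimal sketch, not a reference implementation: the function name `frequency_list`, the sample sentence, and the crude tokenizer (lowercasing, then splitting on runs of word characters) are all assumptions made for illustration.

```python
import re
from collections import Counter


def frequency_list(text):
    """Return (word, count) pairs sorted from most to least frequent."""
    # Assumed tokenizer: lowercase the text and keep runs of word characters.
    tokens = re.findall(r"\w+", text.lower())
    # Counter.most_common() sorts by descending count.
    return Counter(tokens).most_common()


# Hypothetical toy corpus, far too small for real use.
sample = "the king saw the boy and the boy saw the king"
for word, count in frequency_list(sample):
    print(word, count)
```

A real corpus would need a more careful tokenizer, and a decision on whether inflected forms such as "king" and "kings" count as one type or two.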

Zipf's law appears to hold for frequency lists drawn from longer texts of any natural language. Frequency lists are a prerequisite for building an electronic dictionary, which is itself a prerequisite for a wide range of applications in computational linguistics.
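
Zipf's law is commonly stated as frequency being roughly inversely proportional to rank, so a plot of log frequency against log rank should be close to a straight line with slope near -1. One rough way to check this on a frequency list is a least-squares fit, sketched below. The function name `zipf_slope` and the idealised test data are assumptions for illustration, and the fit is a quick diagnostic rather than a rigorous statistical test.

```python
import math


def zipf_slope(frequencies):
    """Least-squares slope of log(frequency) against log(rank).

    Assumes at least two items and strictly positive frequencies.
    """
    # Rank 1 is the most frequent item.
    ordered = sorted(frequencies, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ordered) + 1)]
    ys = [math.log(freq) for freq in ordered]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


# Idealised data in which frequency is exactly proportional to 1/rank.
ideal = [1_000_000 / rank for rank in range(1, 1001)]
slope = zipf_slope(ideal)
```

On real corpora the slope usually deviates from -1, especially among the very rarest items.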

German linguists define the Häufigkeitsklasse (frequency class) N of an item in the list using the base-2 logarithm of the ratio between its frequency and the frequency of the most frequent item. The most common item belongs to frequency class 0 (zero), and any item that is approximately half as frequent belongs to class 1. For example, the misspelled word outragious, with 76 occurrences in a corpus whose most frequent item occurs 3789654 times, has a ratio of 76/3789654 and belongs to class 16.

N=\left\lfloor0.5-\log_2\left(\frac{\text{Frequency of this item}}{\text{Frequency of most common item}}\right)\right\rfloor

where \lfloor\ldots\rfloor is the floor function.
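
The formula translates directly into code. The sketch below is a minimal illustration; the function name `frequency_class` and its parameter names are assumptions, not part of any standard library.

```python
import math


def frequency_class(frequency, max_frequency):
    """Frequency class N = floor(0.5 - log2(frequency / max_frequency))."""
    return math.floor(0.5 - math.log2(frequency / max_frequency))


# The most frequent item is always in class 0.
top = frequency_class(3789654, 3789654)
# An item with 76 occurrences against 3789654 for the most frequent item.
rare = frequency_class(76, 3789654)
print(top, rare)  # prints "0 16"
```

Adding 0.5 before taking the floor rounds the logarithm to the nearest integer, so an item exactly half as frequent as the top item lands in class 1.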

This page uses Creative Commons Licensed content from Wikipedia.
