The Oxford English Corpus is a text corpus of the English language used by the makers of the Oxford English Dictionary and by Oxford University Press's language research programme. It is the largest corpus of its kind, containing over two billion words.[1] The sources for these words are writings of all sorts, from "literary novels and specialist journals to everyday newspapers and magazines and from Hansard to the language of chatrooms, emails, and weblogs".[2] This breadth contrasts with similar databases that sample only one specific kind of writing.

The digital version of the Oxford English Corpus is formatted in XML and usually analysed with Sketch Engine software.[3]

Each document in the Oxford English Corpus is accompanied by metadata recording:

  • title
  • author (if known; many websites make this difficult to determine reliably)
  • author gender (if known)
  • language type (e.g. British English, American English)
  • source website
  • year (and full date, if known)
  • date of collection
  • domain and subdomain
  • document statistics (number of tokens, sentences, etc.)[3]
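
As a rough illustration of how such per-document metadata might be read from an XML file, the following Python sketch parses one record and derives simple document statistics. The element and attribute names (document, meta, language_type, date_collected and so on) and the sample record are hypothetical, invented for this example; the actual schema of the Oxford English Corpus is not described here. Tokens are counted by whitespace and sentences by terminal punctuation, which is far cruder than real corpus tooling.

```python
import re
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional

# Hypothetical sample record. The tag names are assumptions made for
# illustration and do not reflect the corpus's real XML schema.
SAMPLE = """<document>
  <meta>
    <title>An example weblog post</title>
    <author gender="unknown"></author>
    <language_type>British English</language_type>
    <source_website>example.org</source_website>
    <year>2005</year>
    <date_collected>2006-01-15</date_collected>
    <domain subdomain="weblog">leisure</domain>
  </meta>
  <text>This is a short text. It has two sentences.</text>
</document>"""


@dataclass
class DocumentMetadata:
    title: str
    author: Optional[str]          # None when the author is not known
    author_gender: Optional[str]   # None when the gender is not known
    language_type: str
    source_website: str
    year: int
    date_collected: str
    domain: str
    subdomain: Optional[str]
    token_count: int
    sentence_count: int


def parse_document(xml_text: str) -> DocumentMetadata:
    """Parse one hypothetical corpus record into a DocumentMetadata object."""
    root = ET.fromstring(xml_text)
    meta = root.find("meta")

    author_el = meta.find("author")
    author = (author_el.text or "").strip() or None
    gender = author_el.get("gender")
    if gender == "unknown":
        gender = None

    domain_el = meta.find("domain")
    body = root.findtext("text", default="")

    # Naive statistics: whitespace-separated tokens, and sentences split
    # on runs of '.', '!' or '?'.
    tokens = body.split()
    sentences = [s for s in re.split(r"[.!?]+", body) if s.strip()]

    return DocumentMetadata(
        title=meta.findtext("title", default="").strip(),
        author=author,
        author_gender=gender,
        language_type=meta.findtext("language_type", default="").strip(),
        source_website=meta.findtext("source_website", default="").strip(),
        year=int(meta.findtext("year")),
        date_collected=meta.findtext("date_collected", default="").strip(),
        domain=(domain_el.text or "").strip(),
        subdomain=domain_el.get("subdomain"),
        token_count=len(tokens),
        sentence_count=len(sentences),
    )


if __name__ == "__main__":
    print(parse_document(SAMPLE))
```

Representing unknown authors and genders as None mirrors the "if known" qualifiers in the list above, so that downstream queries can tell missing values apart from empty strings.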


  1. How the OED got shorter. Retrieved 2 December 2007.
  2. The Oxford English Corpus. Retrieved 2 December 2007.
  3. Technical information. Retrieved 22 June 2006.

This page uses Creative Commons licensed content from Wikipedia.
