Psychology Wiki

Outlier

Revision as of 19:09, March 7, 2006 by Lifeartist (Talk | contribs)


In statistics, an outlier is a single observation "far away" from the rest of the data.

out·li·er n. 1. One whose domicile lies at an appreciable distance from his or her place of business. 2. A value far from most others in a set of data: “Outliers make statistical analyses difficult” (Harvey Motulsky). 3. Geology. A portion of stratified rock separated from a main formation by erosion.

In most data samples, some points will lie farther from their expected values than is deemed reasonable. This can be due to systematic error or to faults in the theory that generated the expected values. Outlier points can therefore indicate faulty data, erroneous procedures, or areas where a certain theory might not be valid. However, a small number of outliers is expected even in normally distributed data.

Mathematical definitions

Mild outliers

Defining Q_1 and Q_3 to be the first and third quartiles, and IQR to be the interquartile range (Q_3-Q_1), one possible definition of being "far away" in this context is:

x < Q_1 - 1.5\cdot IQR,

or

x > Q_3 + 1.5\cdot IQR,

where x is the observation. The values Q_1 - 1.5\cdot IQR and Q_3 + 1.5\cdot IQR define the so-called inner fences, beyond which an observation would be labeled a mild outlier.

Extreme outliers

Extreme outliers are observations that are beyond the outer fences:

x < Q_1 - 3\cdot IQR,

or

x > Q_3 + 3\cdot IQR.
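The fence definitions can be sketched in code as follows. This is a minimal illustration, not a standard implementation: the function name `classify_outliers` is hypothetical, and the quartiles are computed with Python's `statistics.quantiles` using the "inclusive" method, which is only one of several quartile conventions. Other conventions can move the fences slightly.

```python
from statistics import quantiles


def classify_outliers(data):
    """Label each value 'none', 'mild', or 'extreme' using the IQR fences.

    Assumption: quartiles follow the 'inclusive' interpolation rule of
    statistics.quantiles; other rules give slightly different fences.
    """
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1

    # Inner fences: 1.5 IQR beyond the quartiles.
    inner_lo, inner_hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Outer fences: 3 IQR beyond the quartiles.
    outer_lo, outer_hi = q1 - 3.0 * iqr, q3 + 3.0 * iqr

    labels = []
    for x in data:
        if x < outer_lo or x > outer_hi:
            labels.append("extreme")
        elif x < inner_lo or x > inner_hi:
            labels.append("mild")
        else:
            labels.append("none")
    return labels


sample = [2, 3, 4, 5, 6, 7, 8, 15, 30]
labels = classify_outliers(sample)
```

For this sample the quartiles are 4 and 8, so the inner fences sit at -2 and 14 and the outer fences at -8 and 20. The value 15 is labeled mild and 30 extreme.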

Occurrence and causes

In the case of normally distributed data, using the above definitions, only about 1 in 150 observations will be a mild outlier, and only about 1 in 425,000 an extreme outlier. Because of this, outliers usually demand special attention, since they may indicate problems in sampling or data collection or transcription.
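The quoted rates can be checked directly from the normal distribution. The sketch below assumes a standard normal model (the rates do not depend on the mean or standard deviation) and uses a hypothetical helper, `rate_beyond_fences`, that returns the probability of landing outside the fences at k IQRs beyond the quartiles. Strictly, the mild rate should exclude the extreme rate, but the difference is negligible here.

```python
from statistics import NormalDist


def rate_beyond_fences(k):
    """Probability that a normal observation lies outside
    Q_1 - k*IQR or Q_3 + k*IQR (k = 1.5 inner, k = 3 outer)."""
    z = NormalDist()          # standard normal: mean 0, sd 1
    q3 = z.inv_cdf(0.75)      # third quartile, about 0.6745
    iqr = 2 * q3              # Q_1 = -Q_3 by symmetry
    fence = q3 + k * iqr
    # Two symmetric tails; using cdf(-fence) avoids cancellation in 1 - cdf.
    return 2 * z.cdf(-fence)


beyond_inner = rate_beyond_fences(1.5)
beyond_outer = rate_beyond_fences(3.0)

print(f"beyond inner fences: about 1 in {1 / beyond_inner:,.0f}")
print(f"beyond outer fences: about 1 in {1 / beyond_outer:,.0f}")
```

The inner fences fall at roughly 2.7 standard deviations from the mean and the outer fences at roughly 4.7, which is why the two rates differ by several orders of magnitude.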

Alternatively, an outlier could be the result of a flaw in the assumed theory, calling for further investigation by the researcher.

Non-normal distributions

Even when a normal model is appropriate to the data being analyzed, some outliers are expected in large samples, and they should not automatically be discarded in that case. The possibility should also be considered that the underlying distribution of the data is not approximately normal but instead has "fat tails". For instance, when sampling from a Cauchy distribution, the sample variance tends to increase with the sample size, the sample mean fails to converge as the sample size increases, and outliers are expected at far higher rates than for a normal distribution.
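A small simulation can illustrate the difference in outlier rates. This is a sketch under stated assumptions: the sample size, the seed, and the helper name `fraction_beyond_inner_fences` are arbitrary choices for illustration, and Cauchy variates are drawn by applying the inverse CDF, tan(π(u - 1/2)), to uniform random numbers.

```python
import math
import random
from statistics import quantiles


def fraction_beyond_inner_fences(sample):
    """Share of observations outside Q_1 - 1.5*IQR or Q_3 + 1.5*IQR,
    with quartiles estimated from the sample itself."""
    q1, _, q3 = quantiles(sample, n=4)
    iqr = q3 - q1
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return sum(x < lo or x > hi for x in sample) / len(sample)


rng = random.Random(0)   # fixed seed so the run is repeatable
n = 20_000               # arbitrary sample size for the illustration

normal_sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
# Standard Cauchy via the inverse CDF applied to a uniform variate.
cauchy_sample = [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]

normal_rate = fraction_beyond_inner_fences(normal_sample)
cauchy_rate = fraction_beyond_inner_fences(cauchy_sample)

print(f"normal: {normal_rate:.4f}   Cauchy: {cauchy_rate:.4f}")
```

For a standard Cauchy distribution the quartiles are -1 and 1, so the inner fences are at -4 and 4 and the theoretical share beyond them is 1 - (2/π)·arctan(4), about 16%, compared with about 0.7% for a normal distribution.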

This page uses Creative Commons Licensed content from Wikipedia (view authors).
