# Association (statistics)

In statistics, an association is a relationship between two variables: knowing the value of one gives information about the value of the other. Many people confuse association with causation, but association does not imply causation.

For example, the United Nations did a study of government failure, that is, of cases in which governments fall or are overthrown. The best indicator that a government was about to fall was the infant mortality rate. Infant deaths do not cause the government to fall; rather, both are joint effects of a common cause.

Another example is ice cream consumption and murder. Ice cream sales and the number of murders are strongly positively correlated. Which causes which: does eating ice cream cause murder, or does murder make people eat ice cream? The answer is neither: increases in both ice cream consumption and murder correlate with hot weather.

Another perspective on the relationship between association and causality is that association does not imply a direct causal connection between the associated variables. If, however, the association is nonrandom (i.e., not due purely to chance), then it implies that some causal mechanism is operative. Often, the causal mechanism underlying an association is the joint influence of one or more common causes operating on the variables in question. For example, increases in both ice cream consumption and murder may occur during warm weather (a conclusion that would require further information to confirm or refute). If this were so, then the association between ice cream consumption and murder would be a manifestation of causation, but not in the simple, linear fashion that one initially might be tempted to assume. Associations of this sort, in which a third variable jointly causes the association between the two original variables, are often termed "spurious associations."
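The common-cause explanation can be illustrated with a small simulation. The sketch below is not based on real data: the variable names, coefficients, and noise levels are all invented for illustration. It assumes that temperature drives both ice cream sales and a murder index, and that neither affects the other. It then compares the raw correlation with the correlation that remains once the linear effect of temperature is removed from both variables (a partial correlation).

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def residuals(ys, xs):
    """What is left of ys after a least-squares line on xs is subtracted."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return [(y - mean_y) - slope * (x - mean_x) for x, y in zip(xs, ys)]


# Hypothetical data: temperature is the only common cause.
rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000
temperature = [rng.gauss(20, 8) for _ in range(n)]
ice_cream = [2.0 * t + rng.gauss(0, 10) for t in temperature]
murders = [0.5 * t + rng.gauss(0, 4) for t in temperature]

# Raw association between the two effects.
raw_r = pearson(ice_cream, murders)

# Association after controlling for temperature (partial correlation).
partial_r = pearson(residuals(ice_cream, temperature),
                    residuals(murders, temperature))

print(f"raw correlation:     {raw_r:.3f}")
print(f"partial correlation: {partial_r:.3f}")
```

Under these assumptions the raw correlation is clearly positive, while the partial correlation is close to zero, which is the pattern expected when a third variable accounts for the association.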

Several tests can be used to determine association. The most common are the computation of any of several versions of the correlation coefficient, the P test, the t-test, and the chi-squared test.
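As one illustration of such a test, the following sketch computes the Pearson chi-squared statistic for a table of counts and, for a 2×2 table, its p-value. The counts and the function names are hypothetical. No continuity correction is applied, and the p-value formula is valid only for one degree of freedom.

```python
import math


def chi_squared_statistic(table):
    """Pearson chi-squared statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def p_value_one_df(stat):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))


# Hypothetical 2x2 table: rows are two groups, columns are outcome yes / no.
observed = [[30, 10],
            [15, 25]]

chi2 = chi_squared_statistic(observed)
p = p_value_one_df(chi2)

print(f"chi-squared = {chi2:.3f}, p = {p:.5f}")
```

A small p-value suggests that the two classifications are associated. As the rest of the article stresses, it says nothing about which variable, if either, causes the other.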