# Covariance

34,135pages on
this wiki

In probability theory and statistics, the covariance between two real-valued random variables X and Y, with expected values $E(X)=\mu$ and $E(Y)=\nu$ is defined as:

$\operatorname{cov}(X, Y) = \operatorname{E}((X - \mu) (Y - \nu)), \,$

where E is the expected value.

Intuitively, covariance is the measure of how much two variables vary together. That is to say, the covariance becomes more positive for each pair of values which differ from their mean in the same direction, and becomes more negative with each pair of values which differ from their mean in opposite directions. In this way, the more often they differ in the same direction, the more positive the covariance, and the more often they differ in opposite directions, the more negative the covariance.

The units of measurement of the covariance cov(X, Y) are those of X times those of Y. By contrast, the correlation, which depends on the covariance, is a dimensionless measure of linear dependence.

The definition above is equivalent to the following formula which is commonly used in calculations:

$\operatorname{cov}(X, Y) = \operatorname{E}(X Y) - \mu \nu. \,$

If X and Y are independent, then their covariance is zero. This follows because under independence,

$E(X \cdot Y)=E(X) \cdot E(Y)=\mu\nu,$

The converse, however, is not true: it is possible that X and Y are not independent, yet their covariance is zero. Random variables whose covariance is zero are called uncorrelated.

If X and Y are real-valued random variables and c is a constant ("constant", in this context, means non-random), then the following facts are a consequence of the definition of covariance:

$\operatorname{cov}(X, X) = \operatorname{var}(X)\,$
$\operatorname{cov}(X, Y) = \operatorname{cov}(Y, X)\,$
$\operatorname{cov}(cX, Y) = c\, \operatorname{cov}(X, Y)\,$
$\operatorname{cov}\left(\sum_i{X_i}, \sum_j{Y_j}\right) = \sum_i{\sum_j{\operatorname{cov}\left(X_i, Y_j\right)}}.\,$

For column-vector valued random variables X and Y with respective expected values μ and ν, and n and m scalar components respectively, the covariance is defined to be the n×m matrix

$\operatorname{cov}(X, Y) = \operatorname{E}((X-\mu)(Y-\nu)^\top).\,$

For vector-valued random variables, cov(X, Y) and cov(Y, X) are each other's transposes.

The covariance is sometimes called a measure of "linear dependence" between the two random variables. That phrase does not mean the same thing that it means in a more formal linear algebraic setting (see linear dependence), although that meaning is not unrelated. The correlation is a closely related concept used to measure the degree of linear dependence between two variables.de:Kovarianz (Stochastik) es:Covarianza fr:Covarianceno:Kovarianspt:Covariância su:Kovarian