# Conditional probability distribution

Given two jointly distributed random variables X and Y, the conditional probability distribution of Y given X is the probability distribution of Y when X is known to take a particular value. If the conditional distribution of Y given X is a continuous distribution, its probability density function is called the conditional density function.

Properties of a conditional distribution, such as its moments, are referred to by corresponding names, for example the conditional mean and the conditional variance.
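As an illustration of these names, the following minimal sketch computes a conditional mean and a conditional variance for a small discrete example. The joint probabilities and the helper name `conditional_moments` are assumptions chosen for illustration, and the computation relies on the discrete definition given in the next section.

```python
from fractions import Fraction

# Hypothetical joint pmf P(X = x, Y = y), stored as {(x, y): probability}.
joint = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
    (1, 0): Fraction(1, 4), (1, 1): Fraction(1, 4),
}

def conditional_moments(joint, x):
    """Return (E[Y | X = x], Var(Y | X = x)) for a discrete joint pmf."""
    # P(X = x): sum the joint pmf over every y.
    p_x = sum(p for (xi, _), p in joint.items() if xi == x)
    if p_x == 0:
        raise ValueError("cannot condition on a value with probability zero")
    # Conditional pmf of Y given X = x.
    cond = {y: p / p_x for (xi, y), p in joint.items() if xi == x}
    mean = sum(y * p for y, p in cond.items())
    second_moment = sum(y * y * p for y, p in cond.items())
    return mean, second_moment - mean ** 2

mean, variance = conditional_moments(joint, 0)
print(mean, variance)  # 3/4 3/16
```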

## Discrete distributions

For discrete random variables, the conditional probability mass function of Y given that X takes the value x can be written, using the definition of conditional probability, as:

$p_Y(y\mid X = x)=P(Y = y \mid X = x) = \frac{P(X=x\ \cap Y=y)}{P(X=x)}.$

Because $P(X=x)$ appears in the denominator, the definition requires that $P(X=x) > 0.$
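A small sketch of this definition, assuming a two-dice experiment that is not part of the original text: X is the total of two fair dice and Y is the value of the first die. The function name `conditional_pmf` is illustrative. Exact fractions are used so that the result can be compared without rounding error.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Joint pmf of X = total of two fair dice and Y = value of the first die,
# stored as {(x, y): P(X = x, Y = y)}.
joint = defaultdict(Fraction)
for first, second in product(range(1, 7), repeat=2):
    joint[(first + second, first)] += Fraction(1, 36)

def conditional_pmf(joint, x):
    """Return {y: P(Y = y | X = x)}; undefined when P(X = x) = 0."""
    p_x = sum(p for (xi, _), p in joint.items() if xi == x)
    if p_x == 0:
        raise ValueError("P(X = x) must be positive")
    return {y: p / p_x for (xi, y), p in joint.items() if xi == x}

# Given a total of 4, the first die is 1, 2 or 3 with equal probability.
cond = conditional_pmf(joint, 4)
```

Asking for `conditional_pmf(joint, 13)` raises an error, since a total of 13 has probability zero and the ratio is undefined.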

The relation with the probability distribution of X given Y is:

$P(Y=y \mid X=x) P(X=x) = P(X=x\ \cap Y=y) = P(X=x \mid Y=y)P(Y=y).$
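This relation lets the conditioning be reversed: dividing both sides by $P(Y=y)$ gives the distribution of X given Y. The sketch below assumes a hypothetical marginal for X and a hypothetical conditional pmf of Y given X; the numbers and the function name `x_given_y` are illustrative only.

```python
from fractions import Fraction

# Hypothetical inputs: the marginal pmf of X and the conditional pmf of Y given X.
p_x = {0: Fraction(1, 2), 1: Fraction(1, 2)}
p_y_given_x = {
    0: {0: Fraction(1, 4), 1: Fraction(3, 4)},
    1: {0: Fraction(1, 2), 1: Fraction(1, 2)},
}

def x_given_y(p_x, p_y_given_x, y):
    """Return {x: P(X = x | Y = y)} by reversing the conditioning."""
    # P(X = x, Y = y) = P(Y = y | X = x) P(X = x)
    joint_at_y = {x: p_y_given_x[x][y] * p_x[x] for x in p_x}
    # P(Y = y) is the sum of these joint probabilities over all x.
    p_y = sum(joint_at_y.values())
    if p_y == 0:
        raise ValueError("P(Y = y) must be positive")
    # P(X = x | Y = y) = P(X = x, Y = y) / P(Y = y)
    return {x: p / p_y for x, p in joint_at_y.items()}

reversed_cond = x_given_y(p_x, p_y_given_x, 1)
print(reversed_cond[0], reversed_cond[1])  # 3/5 2/5
```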

## Continuous distributions

Similarly, for continuous random variables, the conditional probability density function of Y given that X takes the value x can be written as

$f_Y(y \mid X=x) = \frac{f_{X, Y}(x, y)}{f_X(x)},$

where $f_{X,Y}(x, y)$ is the joint density of X and Y, and $f_X(x)$ is the marginal density of X. As in the discrete case, it is necessary that $f_X(x)>0$.
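A numerical sketch of this ratio, assuming the joint density $f_{X,Y}(x, y) = x + y$ on the unit square (an example chosen here, not taken from the text). The marginal is approximated with a simple midpoint rule; the helper names are illustrative.

```python
def joint_density(x, y):
    """Hypothetical joint density f_{X,Y}(x, y) = x + y on the unit square."""
    return x + y if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0 else 0.0

def integrate(func, lower, upper, steps=1000):
    """Midpoint-rule approximation of the integral of func over [lower, upper]."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (i + 0.5) * width) for i in range(steps))

def conditional_density_given(x):
    """Return the function y -> f_Y(y | X = x); requires f_X(x) > 0."""
    # Marginal density f_X(x): integrate the joint density over y.
    f_x = integrate(lambda y: joint_density(x, y), 0.0, 1.0)
    if f_x <= 0.0:
        raise ValueError("f_X(x) must be positive")
    return lambda y: joint_density(x, y) / f_x

cond = conditional_density_given(0.25)
value = cond(0.5)                  # analytically (0.25 + 0.5) / (0.25 + 0.5)
total = integrate(cond, 0.0, 1.0)  # a density in y, so this is close to 1
```

For this density the marginal is $f_X(x) = x + \tfrac{1}{2}$, so the conditional density at $x = 0.25$ is $(0.25 + y)/0.75$, which the numerical values above should match up to floating-point error.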

The relation with the probability distribution of X given Y is given by:

$f_Y(y \mid X=x)f_X(x) = f_{X,Y}(x, y) = f_X(x \mid Y=y)f_Y(y).$

The concept of the conditional distribution of a continuous random variable is not as intuitive as it might seem: Borel's paradox shows that conditional probability density functions need not be invariant under coordinate transformations.

## Relation to independence

Random variables X and Y are independent if and only if the conditional distribution of Y given X is equal to the unconditional distribution of Y. For discrete random variables this means $P(Y = y \mid X = x) = P(Y = y)$ for all relevant x and y. For continuous random variables having a joint density it means $f_Y(y \mid X=x) = f_Y(y)$ for all relevant x and y.
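The discrete criterion can be checked directly on a finite joint pmf. The following is a sketch under assumptions: both example tables are hypothetical, the function name `is_independent` is illustrative, and "relevant x" is taken to mean every x with $P(X = x) > 0$. Exact fractions are used because testing floating-point probabilities for equality would be unreliable.

```python
from fractions import Fraction

def is_independent(joint):
    """Check P(Y = y | X = x) == P(Y = y) for every x with P(X = x) > 0."""
    xs = {x for x, _ in joint}
    ys = {y for _, y in joint}
    # Marginal pmfs, obtained by summing the joint pmf over the other variable.
    p_x = {x: sum(joint.get((x, y), 0) for y in ys) for x in xs}
    p_y = {y: sum(joint.get((x, y), 0) for x in xs) for y in ys}
    return all(
        joint.get((x, y), 0) / p_x[x] == p_y[y]
        for x in xs if p_x[x] > 0
        for y in ys
    )

# Hypothetical joint pmf that factorises as P(X = x) P(Y = y).
independent_joint = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
    (1, 0): Fraction(1, 8), (1, 1): Fraction(3, 8),
}
# Hypothetical joint pmf in which the distribution of Y changes with x.
dependent_joint = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
    (1, 0): Fraction(1, 4), (1, 1): Fraction(1, 4),
}

print(is_independent(independent_joint))  # True
print(is_independent(dependent_joint))    # False
```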

## Properties

Seen as a function of y for given x, $P(Y = y \mid X = x)$ is a probability mass function, so its sum over all y (or its integral over y, in the case of a conditional probability density) is 1. Seen as a function of x for given y, it is a likelihood function, and its sum over all x need not be 1.
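Both statements can be checked on a small example. The joint probabilities below are hypothetical and the helper name `conditional` is illustrative.

```python
from fractions import Fraction

# Hypothetical joint pmf P(X = x, Y = y), stored as {(x, y): probability}.
joint = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
    (1, 0): Fraction(1, 4), (1, 1): Fraction(1, 4),
}

def conditional(joint, x, y):
    """P(Y = y | X = x) computed from the joint pmf."""
    p_x = sum(p for (xi, _), p in joint.items() if xi == x)
    return joint[(x, y)] / p_x

# Fixed x = 0, varying y: a probability mass function, so the sum is 1.
sum_over_y = sum(conditional(joint, 0, y) for y in (0, 1))
# Fixed y = 1, varying x: a likelihood function, here 3/4 + 1/2.
sum_over_x = sum(conditional(joint, x, 1) for x in (0, 1))

print(sum_over_y, sum_over_x)  # 1 5/4
```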