Psychology Wiki

Changes: Conditional probability distribution


(Created page with "{{StatsPsy}} {{unreferenced|date=March 2009}} Given two jointly distributed random variables ''X'' and ''Y'', the '''conditional probability distribution''' of ''Y'' given ''...")
 
 
Line 1: Line 1:
 
{{StatsPsy}}
 
{{unreferenced|date=March 2009}}
+
In [[probability theory]] and [[statistics]], given two jointly distributed [[random variable]]s ''X'' and ''Y'', the '''conditional probability distribution''' of ''Y'' given ''X'' is the [[probability distribution]] of ''Y'' when ''X'' is known to be a particular value; in some cases the conditional probabilities may be expressed as functions containing the unspecified value ''x'' of ''X'' as a parameter. The conditional distribution contrasts with the [[marginal distribution]] of a random variable, which is its distribution without reference to the value of the other variable.
Given two jointly distributed [[random variable]]s ''X'' and ''Y'', the '''conditional probability distribution''' of ''Y'' given ''X'' is the [[probability distribution]] of ''Y'' when ''X'' is known to be a particular value. If the conditional distribution of ''Y'' given ''X'' is a [[continuous distribution]], then its [[probability density function]] is known as the '''conditional density function'''.
 
   
The properties of a conditional distribution, such as the moments, are often called by corresponding names such as the [[conditional mean]] and [[conditional variance]].
+
If the conditional distribution of ''Y'' given ''X'' is a [[continuous distribution]], then its [[probability density function]] is known as the '''conditional density function'''. The properties of a conditional distribution, such as the [[Moment (mathematics)|moments]], are often referred to by corresponding names such as the [[conditional mean]] and [[conditional variance]].
  +
  +
More generally, one can refer to the conditional distribution of a subset of a set of more than two variables; this conditional distribution is contingent on the values of all the remaining variables, and if more than one variable is included in the subset then this conditional distribution is the conditional [[joint distribution]] of the included variables.
   
 
==Discrete distributions==
 
For [[discrete random variable]]s, the [[conditional probability]] mass function of ''Y'' given (the occurrence of) the value ''x'' of ''X'', can be written, using the definition of [[conditional probability]], as:
+
For [[discrete random variable]]s, the [[conditional probability]] mass function of ''Y'' given the occurrence of the value ''x'' of ''X'' can be written according to its definition as:
   
 
:<math>p_Y(y\mid X = x)=P(Y = y \mid X = x) = \frac{P(X=x\ \cap Y=y)}{P(X=x)}.</math>
 
   
As seen from the definition, and due to its occurrence, it is necessary that <math>P(X=x) > 0.</math>
+
Due to the occurrence of <math>P(X=x)</math> in a denominator, this is defined only for non-zero (hence strictly positive) <math>P(X=x).</math>
   
 
The relation with the probability distribution of ''X'' given ''Y'' is:
 
Line 16: Line 16:
   
 
==Continuous distributions==
 
Similarly for [[continuous random variable]]s, the conditional [[probability density function]] of ''Y'' given (the occurrence of) the value ''x'' of ''X'', can be written as
+
Similarly for [[continuous random variable]]s, the conditional [[probability density function]] of ''Y'' given the occurrence of the value ''x'' of ''X'' can be written as
   
 
:<math>f_Y(y \mid X=x) = \frac{f_{X, Y}(x, y)}{f_X(x)}, </math>
 
Line 28: Line 28:
   
 
==Relation to independence==
 
Random variables ''X'', ''Y'' are [[Statistical independence|independent]] if and only if the conditional distribution of ''Y'' given ''X'' is equal to the unconditional distribution of ''Y''. For discrete random variables: ''P''(''Y'' = ''y'' | ''X'' = ''x'') = ''P''(''Y'' = ''y'') for all relevant ''x'' and ''y''. For continuous random variables having a joint density: ''f''<sub>''Y''</sub>(''y'' | ''X=x'') = ''f''<sub>''Y''</sub>(''y'') for all relevant x and y.
+
Random variables ''X'', ''Y'' are [[Statistical independence|independent]] if and only if the conditional distribution of ''Y'' given ''X'' is, for all possible realizations of ''X'', equal to the unconditional distribution of ''Y''. For discrete random variables this means ''P''(''Y'' = ''y'' | ''X'' = ''x'') = ''P''(''Y'' = ''y'') for all relevant ''x'' and ''y''. For continuous random variables ''X'' and ''Y'', having a [[joint density function]], it means ''f''<sub>''Y''</sub>(''y'' | ''X=x'') = ''f''<sub>''Y''</sub>(''y'') for all relevant x and y.
   
 
==Properties==
 
 
Seen as a function of ''y'' for given ''x'', ''P''(''Y'' = ''y'' | ''X'' = ''x'') is a probability and so the sum over all ''y'' (or integral if it is a conditional probability density) is 1. Seen as a function of ''x'' for given ''y'', it is a [[likelihood function]], so that the sum over all ''x'' need not be 1.
 
  +
  +
==Measure-Theoretic Formulation==
  +
Let <math>(\Omega, \mathcal{F}, P)</math> be a probability space, <math>\mathcal{G} \subseteq \mathcal{F}</math> a <math>\sigma</math>-field in <math>\mathcal{F}</math>, and <math>X : \Omega \to \mathbb{R}</math> a real-valued random variable (measurable with respect to the Borel <math>\sigma</math>-field <math>\mathcal{R}^1</math> on <math>\mathbb{R}</math>). It can be shown that there exists<ref>[[#billingsley95|Billingsley (1995)]], p. 439</ref> a function <math>\mu : \mathcal{R}^1 \times \Omega \to \mathbb{R}</math> such that <math>\mu(\cdot, \omega)</math> is a probability measure on <math>\mathcal{R}^1</math> for each <math>\omega \in \Omega</math> and <math>\mu(H, \cdot) = P(X \in H | \mathcal{G})</math> (almost surely) for every <math>H \in \mathcal{R}^1</math>. For any <math>\omega \in \Omega</math>, the function <math>\mu(\cdot, \omega) : \mathcal{R}^1 \to \mathbb{R}</math> is called a '''conditional probability distribution''' of <math>X</math> given <math>\mathcal{G}</math>. In this case,
  +
:<math>E[X | \mathcal{G}] = \int_{-\infty}^\infty x \, \mu(d x, \cdot)</math>
  +
almost surely.
   
 
== See also ==
 
Line 37: Line 42:
 
*[[Conditional probability]]
 
 
*[[Regular conditional probability]]
 
  +
  +
==Notes==
  +
{{reflist}}
  +
  +
==References==
  +
*{{cite book
  +
| author = [[Patrick Billingsley]]
  +
| title = Probability and Measure, 3rd ed.
  +
| publisher = John Wiley and Sons
  +
| location = New York, Toronto, London
  +
| year = 1995
  +
| ref = billingsley95}}
  +
   
 
[[Category:Probability theory]]
 

Latest revision as of 19:43, July 2, 2013



In probability theory and statistics, given two jointly distributed random variables X and Y, the conditional probability distribution of Y given X is the probability distribution of Y when X is known to take a particular value; in some cases the conditional probabilities may be expressed as functions containing the unspecified value x of X as a parameter. The conditional distribution contrasts with the marginal distribution of a random variable, which is its distribution without reference to the value of the other variable.

If the conditional distribution of Y given X is a continuous distribution, then its probability density function is known as the conditional density function. The properties of a conditional distribution, such as the moments, are often referred to by corresponding names such as the conditional mean and conditional variance.

More generally, one can refer to the conditional distribution of a subset of a set of more than two variables; this conditional distribution is contingent on the values of all the remaining variables, and if more than one variable is included in the subset then this conditional distribution is the conditional joint distribution of the included variables.

Discrete distributions

For discrete random variables, the conditional probability mass function of Y given the occurrence of the value x of X can be written according to its definition as:

p_Y(y\mid X = x)=P(Y = y \mid X = x) = \frac{P(X=x\ \cap Y=y)}{P(X=x)}.

Because P(X=x) appears in the denominator, this is defined only when P(X=x) is non-zero (hence strictly positive).
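
As a minimal sketch of this definition, the snippet below computes a conditional probability mass function from a small joint table. The joint probabilities and the helper names (`marginal_x`, `conditional_pmf_y_given_x`) are illustrative assumptions, not part of the article; exact fractions are used so that the arithmetic can be checked by hand.

```python
from fractions import Fraction

# Hypothetical joint pmf P(X = x, Y = y), stored as {(x, y): probability}.
joint = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
    (1, 0): Fraction(2, 8), (1, 1): Fraction(2, 8),
}

def marginal_x(joint, x):
    """P(X = x), obtained by summing the joint pmf over all y."""
    return sum(p for (xi, _), p in joint.items() if xi == x)

def conditional_pmf_y_given_x(joint, x):
    """Return {y: P(Y = y | X = x)}; undefined when P(X = x) = 0."""
    px = marginal_x(joint, x)
    if px == 0:
        raise ValueError(f"P(X = {x}) is zero, so the conditional pmf is undefined")
    return {y: p / px for (xi, y), p in joint.items() if xi == x}

cond = conditional_pmf_y_given_x(joint, 0)
print(cond)
```

With this table P(X=0) = 1/2, so P(Y=1 | X=0) = (3/8)/(1/2) = 3/4. Asking for a value of x that never occurs (for example x = 2) raises an error, mirroring the requirement that P(X=x) be strictly positive.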

The relation with the probability distribution of X given Y is:

P(Y=y \mid X=x) P(X=x) = P(X=x\ \cap Y=y) = P(X=x \mid Y=y)P(Y=y).
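
The relation above can be used to pass from the distribution of Y given X to that of X given Y. The following sketch does this for a hypothetical two-by-two example; the numbers and the function name `reverse_conditional` are assumptions made for illustration only.

```python
from fractions import Fraction

# Hypothetical inputs: the marginal of X and the conditional pmf of Y given X.
p_x = {0: Fraction(1, 2), 1: Fraction(1, 2)}
p_y_given_x = {
    0: {0: Fraction(1, 4), 1: Fraction(3, 4)},
    1: {0: Fraction(1, 2), 1: Fraction(1, 2)},
}

def reverse_conditional(p_x, p_y_given_x):
    """Return (P(X = x | Y = y) as {y: {x: prob}}, P(Y = y) as {y: prob})."""
    # P(X = x, Y = y) = P(Y = y | X = x) P(X = x)
    joint = {(x, y): p_y_given_x[x][y] * p_x[x]
             for x in p_x for y in p_y_given_x[x]}
    # P(Y = y) is the sum of the joint pmf over x.
    p_y = {}
    for (x, y), p in joint.items():
        p_y[y] = p_y.get(y, 0) + p
    # P(X = x | Y = y) = P(X = x, Y = y) / P(Y = y), only where P(Y = y) > 0.
    p_x_given_y = {
        y: {x: joint[(x, y)] / p_y[y] for x in p_x if (x, y) in joint}
        for y in p_y if p_y[y] > 0
    }
    return p_x_given_y, p_y

p_x_given_y, p_y = reverse_conditional(p_x, p_y_given_x)

# Both factorizations give the same joint probability.
for x in p_x:
    for y in p_y:
        assert p_y_given_x[x][y] * p_x[x] == p_x_given_y[y][x] * p_y[y]
```

Here P(Y=0) = 3/8, so P(X=0 | Y=0) = (1/8)/(3/8) = 1/3.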

Continuous distributions

Similarly for continuous random variables, the conditional probability density function of Y given the occurrence of the value x of X can be written as

f_Y(y \mid X=x) = \frac{f_{X, Y}(x, y)}{f_X(x)},

where f_{X,Y}(x, y) gives the joint density of X and Y, while f_X(x) gives the marginal density for X. Also in this case it is necessary that f_X(x) > 0.

The relation with the probability distribution of X given Y is given by:

f_Y(y \mid X=x)f_X(x) = f_{X,Y}(x, y) = f_X(x \mid Y=y)f_Y(y).
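
To illustrate the ratio f_{X,Y}(x, y) / f_X(x) numerically, the sketch below uses a standard bivariate normal pair with correlation 0.6. The choice of distribution and all names are assumptions made for this example. For this particular family the conditional density of Y given X = x is known to be normal with mean ρx and variance 1 − ρ², which gives an independent check on the ratio.

```python
import math

RHO = 0.6  # assumed correlation between X and Y

def joint_density(x, y, rho=RHO):
    """Density of a standard bivariate normal pair with correlation rho."""
    norm = 1.0 / (2.0 * math.pi * math.sqrt(1.0 - rho ** 2))
    quad = (x * x - 2.0 * rho * x * y + y * y) / (2.0 * (1.0 - rho ** 2))
    return norm * math.exp(-quad)

def marginal_density_x(x):
    """Marginal density of X, which is standard normal here."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def conditional_density(y, x, rho=RHO):
    """f_Y(y | X = x) = f_{X,Y}(x, y) / f_X(x), requiring f_X(x) > 0."""
    fx = marginal_density_x(x)
    if fx <= 0.0:
        raise ValueError("f_X(x) must be strictly positive")
    return joint_density(x, y, rho) / fx

def normal_pdf(y, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((y - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

x0 = 1.0
ratio_value = conditional_density(0.3, x0)
closed_form = normal_pdf(0.3, RHO * x0, 1.0 - RHO ** 2)

# Trapezoidal check that the conditional density integrates to about 1 over y.
step = 0.01
grid = [-10.0 + i * step for i in range(2001)]
values = [conditional_density(y, x0) for y in grid]
total = step * (sum(values) - 0.5 * (values[0] + values[-1]))
```

The two values `ratio_value` and `closed_form` agree up to floating-point error, and `total` is close to 1, as expected of a density in y.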

The concept of the conditional distribution of a continuous random variable is not as intuitive as it might seem: Borel's paradox shows that conditional probability density functions need not be invariant under coordinate transformations.

Relation to independence

Random variables X, Y are independent if and only if the conditional distribution of Y given X is, for all possible realizations of X, equal to the unconditional distribution of Y. For discrete random variables this means P(Y = y | X = x) = P(Y = y) for all relevant x and y. For continuous random variables X and Y, having a joint density function, it means f_Y(y | X=x) = f_Y(y) for all relevant x and y.
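
For discrete variables this criterion can be checked directly on a joint table. The sketch below is one possible implementation; the helper names and both example tables are hypothetical. Values of x with P(X = x) = 0 are skipped, since the conditional distribution is not defined there.

```python
from fractions import Fraction
from itertools import product

def marginals(joint):
    """Return (P(X = x), P(Y = y)) as two dictionaries."""
    p_x, p_y = {}, {}
    for (x, y), p in joint.items():
        p_x[x] = p_x.get(x, 0) + p
        p_y[y] = p_y.get(y, 0) + p
    return p_x, p_y

def is_independent(joint):
    """True if P(Y = y | X = x) == P(Y = y) for every x with P(X = x) > 0."""
    p_x, p_y = marginals(joint)
    for x, y in product(p_x, p_y):
        if p_x[x] == 0:
            continue  # conditioning on x is undefined here
        conditional = joint.get((x, y), 0) / p_x[x]
        if conditional != p_y[y]:
            return False
    return True

# Independent by construction: the joint pmf is a product of marginals.
px = {0: Fraction(1, 3), 1: Fraction(2, 3)}
py = {0: Fraction(1, 4), 1: Fraction(3, 4)}
independent_joint = {(x, y): px[x] * py[y] for x in px for y in py}

# A dependent table: P(Y = 0 | X = 0) = 1/4 but P(Y = 0) = 3/8.
dependent_joint = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
    (1, 0): Fraction(2, 8), (1, 1): Fraction(2, 8),
}

print(is_independent(independent_joint), is_independent(dependent_joint))
```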

Properties

Seen as a function of y for given x, P(Y = y | X = x) is a probability mass function, so the sum over all y (or the integral, if it is a conditional probability density) is 1. Seen as a function of x for given y, it is a likelihood function, so the sum over all x need not be 1.
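
Both statements can be seen on a small hypothetical joint table. In the sketch below (all values and names are illustrative assumptions), summing P(Y = y | X = x) over y gives 1 for each x, while summing over x for fixed y does not.

```python
from fractions import Fraction

# Hypothetical joint pmf P(X = x, Y = y).
joint = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
    (1, 0): Fraction(2, 8), (1, 1): Fraction(2, 8),
}

def p_y_given_x(y, x):
    """P(Y = y | X = x) computed from the joint table."""
    px = sum(p for (xi, _), p in joint.items() if xi == x)
    return joint[(x, y)] / px

xs = sorted({x for x, _ in joint})
ys = sorted({y for _, y in joint})

# For fixed x: a probability distribution in y, so each sum is 1.
sums_over_y = {x: sum(p_y_given_x(y, x) for y in ys) for x in xs}

# For fixed y: a likelihood in x, whose sum is not constrained to be 1.
sums_over_x = {y: sum(p_y_given_x(y, x) for x in xs) for y in ys}

print(sums_over_y, sums_over_x)
```

In this example the sums over x are 3/4 for y = 0 and 5/4 for y = 1.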

Measure-Theoretic Formulation

Let (\Omega, \mathcal{F}, P) be a probability space, \mathcal{G} \subseteq \mathcal{F} a \sigma-field in \mathcal{F}, and X : \Omega \to \mathbb{R} a real-valued random variable (measurable with respect to the Borel \sigma-field \mathcal{R}^1 on \mathbb{R}). It can be shown that there exists[1] a function \mu : \mathcal{R}^1 \times \Omega \to \mathbb{R} such that \mu(\cdot, \omega) is a probability measure on \mathcal{R}^1 for each \omega \in \Omega and \mu(H, \cdot) = P(X \in H | \mathcal{G}) (almost surely) for every H \in \mathcal{R}^1. For any \omega \in \Omega, the function \mu(\cdot, \omega) : \mathcal{R}^1 \to \mathbb{R} is called a conditional probability distribution of X given \mathcal{G}. In this case,

E[X | \mathcal{G}] = \int_{-\infty}^\infty x \, \mu(d x, \cdot)

almost surely.
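
The general construction is abstract, but it becomes concrete in a simple special case. The sketch below assumes a finite sample space (a fair six-sided die) and a σ-field generated by a partition into blocks of positive probability; under those assumptions the conditional distribution at ω reduces to elementary conditioning on the block containing ω. The sample space, the partition, and the names `mu` and `conditional_expectation` are chosen for illustration and do not come from Billingsley.

```python
from fractions import Fraction

# Assumed finite probability space: a fair six-sided die.
omega = [1, 2, 3, 4, 5, 6]
prob = {w: Fraction(1, 6) for w in omega}

# Assumed generating partition for the sub-sigma-field G: "low" and "high" faces.
partition = [frozenset({1, 2, 3}), frozenset({4, 5, 6})]

def X(w):
    """The random variable: the face shown by the die."""
    return w

def block_of(w):
    """The partition block containing the outcome w."""
    for block in partition:
        if w in block:
            return block
    raise ValueError(f"{w} is not in the sample space")

def mu(H, w):
    """mu(H, w) = P(X in H | G)(w), by conditioning on the block of w."""
    block = block_of(w)
    p_block = sum(prob[v] for v in block)
    return sum(prob[v] for v in block if X(v) in H) / p_block

def conditional_expectation(w):
    """E[X | G](w), the integral of x against mu(dx, w); a finite sum here."""
    values = {X(v) for v in omega}
    return sum(x * mu({x}, w) for x in values)

print(conditional_expectation(1), conditional_expectation(6))
```

For each fixed ω, `mu(·, ω)` is a probability measure on the values of X, and the conditional expectation is 2 on the low block and 5 on the high block. Averaging it over ω recovers E[X] = 7/2.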

See also

  • Conditional probability
  • Regular conditional probability

Notes

  1. Billingsley (1995), p. 439

References

  • Patrick Billingsley (1995). Probability and Measure, 3rd ed., New York, Toronto, London: John Wiley and Sons.


This page uses Creative Commons Licensed content from Wikipedia.
