# Variance homogeneity


In statistics, a sequence or a vector of random variables is homoscedastic if all random variables in the sequence or vector have the same finite variance. This is also known as homogeneity of variance. The complementary notion is called heteroscedasticity. The alternative spellings homoskedasticity and heteroskedasticity are also in frequent use.

The assumption of homoscedasticity simplifies mathematical and computational treatment. Serious violations of homoscedasticity (treating data as homoscedastic when it is in fact heteroscedastic) result in overestimating the goodness of fit as measured by the Pearson coefficient.

## Assumptions of a regression model

In simple linear regression analysis, one assumption of the fitted model (required, by the Gauss–Markov theorem, for the least-squares estimators to be best linear unbiased estimators of the respective population parameters) is that the standard deviation of the error term is constant and does not depend on the x-value. Consequently, the probability distribution of y (the response variable) has the same standard deviation regardless of the x-value (the predictor). In short, this assumption is homoscedasticity. Homoscedasticity is not required for the estimates to be unbiased, consistent, and asymptotically normal.
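The distinction can be seen in a short simulation. This is a minimal sketch (variable names and the specific error structure, an error standard deviation proportional to x, are illustrative assumptions, not from the source): with heteroscedastic errors the least-squares slope is still recovered, but the constant-variance assumption behind the usual standard errors no longer holds.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
x = rng.uniform(1.0, 10.0, size=n)

# Homoscedastic errors: constant standard deviation.
y_hom = 2.0 + 3.0 * x + rng.normal(0.0, 1.0, size=n)
# Heteroscedastic errors: standard deviation grows with x (illustrative choice).
y_het = 2.0 + 3.0 * x + rng.normal(0.0, 0.5 * x, size=n)

def ols(x, y):
    """Least-squares fit of y = a + b*x, returning (intercept, slope)."""
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.lstsq(X, y, rcond=None)[0]

a_hom, b_hom = ols(x, y_hom)
a_het, b_het = ols(x, y_het)

# Both fits recover a slope near the true value 3: heteroscedasticity does
# not bias the estimates, but it invalidates the usual standard errors.
print(b_hom, b_het)
```

Both printed slopes land close to 3, illustrating the last sentence above: unbiasedness and consistency survive heteroscedasticity, while inference based on constant-variance standard errors does not.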

## Testing

Residuals can be tested for homoscedasticity using the Breusch–Pagan test, which regresses the squared residuals on the independent variables. Because the Breusch–Pagan test is sensitive to departures from normality, the Koenker–Bassett (generalized, or "studentized", Breusch–Pagan) test statistic is preferred for general use. Testing for groupwise heteroscedasticity requires the Goldfeld–Quandt test.
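The studentized version of the test can be sketched in a few lines. This is an illustrative implementation under stated assumptions (the simulated data, the single regressor, and the 5% chi-squared critical value of 3.84 for one degree of freedom are choices made here, not taken from the source): the statistic is n times the R² of the auxiliary regression of the squared residuals on the regressors.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
x = rng.uniform(1.0, 10.0, size=n)
# Heteroscedastic data: error standard deviation grows with x.
y = 2.0 + 3.0 * x + rng.normal(0.0, 0.5 * x, size=n)

X = np.column_stack([np.ones_like(x), x])

def koenker_bp_stat(X, y):
    """Koenker's studentized Breusch-Pagan statistic: n * R^2 from
    regressing the squared OLS residuals on the regressors."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    e2 = (y - X @ beta) ** 2                      # squared residuals
    gamma = np.linalg.lstsq(X, e2, rcond=None)[0] # auxiliary regression
    fitted = X @ gamma
    r2 = 1.0 - np.sum((e2 - fitted) ** 2) / np.sum((e2 - e2.mean()) ** 2)
    return len(y) * r2

lm = koenker_bp_stat(X, y)
# Under the null of homoscedasticity, lm is asymptotically chi-squared with
# 1 degree of freedom (one non-constant regressor); the 5% critical value
# is about 3.84, so a large lm rejects homoscedasticity.
print(lm > 3.84)
```

On this strongly heteroscedastic sample the statistic comfortably exceeds the critical value, so the null of constant variance is rejected.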

## Homoscedastic distributions

Two or more normal distributions, $N(\mu_i,\Sigma_i)$, are homoscedastic if they share a common covariance (or correlation) matrix, $\Sigma_i = \Sigma_j,\ \forall i,j$. Homoscedastic distributions are especially useful in deriving statistical pattern recognition and machine learning algorithms. One popular example is Fisher's linear discriminant analysis.
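The role of the shared covariance matrix can be illustrated with a small sketch of Fisher's linear discriminant for two homoscedastic Gaussian classes (the means, covariance, and equal-priors midpoint threshold below are illustrative assumptions): with a common $\Sigma$, the optimal projection direction is $w = \Sigma^{-1}(\mu_1 - \mu_0)$, and the decision boundary is linear.

```python
import numpy as np

rng = np.random.default_rng(2)

# Two homoscedastic Gaussian classes: different means, shared covariance.
mu0, mu1 = np.array([0.0, 0.0]), np.array([2.0, 1.0])
Sigma = np.array([[1.0, 0.3],
                  [0.3, 0.5]])

X0 = rng.multivariate_normal(mu0, Sigma, size=500)
X1 = rng.multivariate_normal(mu1, Sigma, size=500)

# Fisher's discriminant direction under equal covariances:
# w = Sigma^{-1} (mu1 - mu0).
w = np.linalg.solve(Sigma, mu1 - mu0)
# Midpoint threshold, assuming equal class priors.
threshold = w @ (mu0 + mu1) / 2.0

pred0 = X0 @ w > threshold  # should mostly be False (class 0)
pred1 = X1 @ w > threshold  # should mostly be True (class 1)

accuracy = (np.sum(~pred0) + np.sum(pred1)) / 1000.0
print(accuracy)
```

Because both classes share $\Sigma$, the quadratic terms of the two Gaussian log-densities cancel and the classifier reduces to this single linear projection; with unequal covariances the boundary would instead be quadratic.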

A more general definition is that of spherical homoscedastic distributions.