Mean squared prediction error


In statistics, the mean squared prediction error (MSPE) of a smoothing procedure is the expected sum of squared deviations of the fitted values \widehat{g}(x_i) from the values g(x_i) of the (unobservable) function g. If the smoothing procedure has operator matrix L, so that the vector of fitted values is \widehat{g}=Ly for the observations y, then

\operatorname{MSPE}(L)=\operatorname{E}\left[\sum_{i=1}^n\left( g(x_i)-\widehat{g}(x_i)\right)^2\right].

The MSPE can be decomposed into two terms, just as mean squared error is decomposed into squared bias and variance; for MSPE, one term is the sum of squared biases of the fitted values and the other is the sum of variances of the fitted values:

\operatorname{MSPE}(L)=\sum_{i=1}^n\left(\operatorname{E}\left[\widehat{g}(x_i)\right]-g(x_i)\right)^2+\sum_{i=1}^n\operatorname{var}\left[\widehat{g}(x_i)\right].

Note that calculating MSPE exactly requires knowledge of g, which is unobservable in practice.
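The decomposition can be checked numerically when g is chosen by hand. The sketch below is only an illustration under stated assumptions: the function g(x)=x^2, the three-point moving-average smoother `smooth`, the Gaussian noise with unit standard deviation, and the names `simulate_decomposition`, `reps` and `seed` are all hypothetical choices, not part of the definition above. Expectations are replaced by averages over simulated data sets, for which the identity holds exactly up to rounding.

```python
import random


def smooth(y):
    """Hypothetical linear smoother: average each point with its immediate neighbours."""
    n = len(y)
    fitted = []
    for i in range(n):
        window = y[max(0, i - 1):min(n, i + 2)]
        fitted.append(sum(window) / len(window))
    return fitted


def simulate_decomposition(g, sigma=1.0, reps=2000, seed=0):
    """Monte Carlo versions of MSPE, summed squared bias and summed variance."""
    rng = random.Random(seed)
    n = len(g)
    fits = []
    for _ in range(reps):
        # Assumed data model: y_i = g(x_i) + sigma * standard normal noise.
        y = [gi + sigma * rng.gauss(0.0, 1.0) for gi in g]
        fits.append(smooth(y))
    # Average over replications of the summed squared prediction errors.
    mspe = sum(sum((gi - fi) ** 2 for gi, fi in zip(g, fit)) for fit in fits) / reps
    # Per-point mean of the fitted values, standing in for E[g_hat(x_i)].
    means = [sum(fit[i] for fit in fits) / reps for i in range(n)]
    bias_sq = sum((m - gi) ** 2 for m, gi in zip(means, g))
    # Per-point variance of the fitted values (divisor reps), summed over points.
    variance = sum(sum((fit[i] - means[i]) ** 2 for fit in fits) / reps for i in range(n))
    return mspe, bias_sq, variance


g = [x ** 2 for x in (0.0, 0.5, 1.0, 1.5, 2.0)]  # hypothetical true function values
mspe, bias_sq, variance = simulate_decomposition(g)
print(mspe, bias_sq + variance)  # the two numbers agree up to rounding error
```

The simulation is only possible because g was fixed in advance, which is exactly the knowledge that is missing with real data.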

Estimation of MSPE

For the model y_i=g(x_i)+\sigma\varepsilon_i, where the errors \varepsilon_i\sim\mathcal{N}(0,1) are independent, one may write

\operatorname{MSPE}(L)=g'(I-L)'(I-L)g+\sigma^2\operatorname{tr}\left[L'L\right].
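As a small worked instance of this matrix form, the sketch below evaluates both terms for a three-point example. It assumes a linear smoother whose fitted values are L applied to y; the moving-average matrix, the values of g and \sigma^2=1 are hypothetical, and the helper names `moving_average_matrix` and `mspe_closed_form` are illustrative only.

```python
def moving_average_matrix(n):
    """Hypothetical operator matrix L: each row averages a point with its neighbours."""
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        cols = range(max(0, i - 1), min(n, i + 2))
        for j in cols:
            L[i][j] = 1.0 / len(cols)
    return L


def mspe_closed_form(L, g, sigma2):
    """Evaluate g'(I-L)'(I-L)g + sigma^2 tr[L'L] for a known vector g."""
    # L g is the expected value of the fitted values under the model.
    fitted_mean = [sum(a * gj for a, gj in zip(row, g)) for row in L]
    # g'(I-L)'(I-L)g is the squared length of g - L g (sum of squared biases).
    bias_sq = sum((gi - m) ** 2 for gi, m in zip(g, fitted_mean))
    # tr[L'L] equals the sum of the squared entries of L.
    trace_LtL = sum(a * a for row in L for a in row)
    return bias_sq + sigma2 * trace_LtL


g = [0.0, 1.0, 4.0]            # hypothetical values of g at three design points
L = moving_average_matrix(3)
mspe = mspe_closed_form(L, g, sigma2=1.0)
print(mspe)  # 53/18 + 4/3 = 77/18, approximately 4.2778
```

Here the bias term is 53/18 and the variance term is 4/3, so heavier smoothing would trade a smaller trace for a larger bias.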

The first term, the sum of squared biases, can be rewritten in terms of the expected residual sum of squares:

\sum_{i=1}^n\left(\operatorname{E}\left[\widehat{g}(x_i)\right]-g(x_i)\right)^2
=\operatorname{E}\left[\sum_{i=1}^n\left(y_i-\widehat{g}(x_i)\right)^2\right]-\sigma^2\operatorname{tr}\left[\left(I-L\right)'\left(I-L\right)\right].

Thus,

\operatorname{MSPE}(L)=\operatorname{E}\left[\sum_{i=1}^n\left(y_i-\widehat{g}(x_i)\right)^2\right]-\sigma^2\left(n-2\operatorname{tr}\left[L\right]\right).
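
The two steps above can be spelled out as follows, assuming the fitted values are \widehat{g}=Ly and the errors are uncorrelated with unit variance. The residual vector is (I-L)y=(I-L)g+\sigma(I-L)\varepsilon, and the cross term has expectation zero, so

\operatorname{E}\left[\sum_{i=1}^n\left(y_i-\widehat{g}(x_i)\right)^2\right]=g'(I-L)'(I-L)g+\sigma^2\operatorname{tr}\left[\left(I-L\right)'\left(I-L\right)\right],

which rearranges to the expression for the first term. Expanding the trace gives

\operatorname{tr}\left[\left(I-L\right)'\left(I-L\right)\right]=n-2\operatorname{tr}\left[L\right]+\operatorname{tr}\left[L'L\right],

so the \sigma^2\operatorname{tr}\left[L'L\right] contributions cancel when this is substituted into the formula for MSPE, leaving only -\sigma^2\left(n-2\operatorname{tr}\left[L\right]\right).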

If \sigma^2 is known or well-estimated by \widehat{\sigma}^2, it becomes possible to estimate MSPE by

\operatorname{\widehat{MSPE}}(L)=\sum_{i=1}^n\left(y_i-\widehat{g}(x_i)\right)^2-\widehat{\sigma}^2\left(n-2\operatorname{tr}\left[L\right]\right).
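
The estimator above needs only the data, the operator matrix and a variance estimate. The following is a minimal sketch, assuming a linear smoother with fitted values L applied to y; the moving-average matrix, the data values and the supplied `sigma2_hat` are hypothetical, and how \widehat{\sigma}^2 is obtained is left outside the sketch.

```python
def moving_average_matrix(n):
    """Hypothetical operator matrix L: each row averages a point with its neighbours."""
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        cols = range(max(0, i - 1), min(n, i + 2))
        for j in cols:
            L[i][j] = 1.0 / len(cols)
    return L


def estimated_mspe(y, L, sigma2_hat):
    """Residual sum of squares minus sigma2_hat * (n - 2 tr[L])."""
    n = len(y)
    fitted = [sum(a * yj for a, yj in zip(row, y)) for row in L]  # fitted values L y
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    trace_L = sum(L[i][i] for i in range(n))
    return rss - sigma2_hat * (n - 2.0 * trace_L)


y = [0.2, 0.9, 4.1, 8.8, 16.3]   # hypothetical observations
L = moving_average_matrix(len(y))
print(estimated_mspe(y, L, sigma2_hat=0.05))  # sigma2_hat is an assumed value
```

Because a multiple of \widehat{\sigma}^2 is subtracted from the residual sum of squares, the estimate can come out negative in a particular sample even though the MSPE itself cannot.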

Colin Mallows advocated this method in the construction of his model selection statistic C_p, which is a normalized version of the estimated MSPE:

C_p=\frac{\sum_{i=1}^n\left(y_i-\widehat{g}(x_i)\right)^2}{\widehat{\sigma}^2}-n+2\operatorname{tr}\left[L\right],

where p comes from the fact that the number of parameters p estimated for a parametric smoother is given by p=\operatorname{tr}\left[L\right], and C is in honor of Cuthbert Daniel.
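
The relation p=\operatorname{tr}\left[L\right] can be illustrated with a straight-line least-squares fit, whose operator matrix is the usual hat matrix. The sketch below is an assumption-laden example: the data are hypothetical, the helper names `hat_matrix`, `residual_sum_of_squares` and `mallows_cp` are illustrative, and \widehat{\sigma}^2 is taken from the same fit purely to keep the example small.

```python
def hat_matrix(x):
    """Operator matrix of a least-squares line with intercept:
    L[i][j] = 1/n + (x_i - xbar)(x_j - xbar) / Sxx."""
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return [[1.0 / n + (xi - xbar) * (xj - xbar) / sxx for xj in x] for xi in x]


def residual_sum_of_squares(y, L):
    fitted = [sum(a * yj for a, yj in zip(row, y)) for row in L]  # fitted values L y
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def mallows_cp(y, L, sigma2_hat):
    """C_p = RSS / sigma2_hat - n + 2 tr[L]."""
    n = len(y)
    trace_L = sum(L[i][i] for i in range(n))
    return residual_sum_of_squares(y, L) / sigma2_hat - n + 2.0 * trace_L


x = [0.0, 1.0, 2.0, 3.0, 4.0]     # hypothetical design points
y = [1.1, 1.9, 3.2, 3.8, 5.1]     # hypothetical observations
L = hat_matrix(x)
p = sum(L[i][i] for i in range(len(x)))   # tr[L]: intercept plus slope, so 2
# Assumption for illustration: sigma2_hat taken from this same fit.
sigma2_hat = residual_sum_of_squares(y, L) / (len(x) - p)
cp = mallows_cp(y, L, sigma2_hat)
print(p, cp)  # both approximately 2
```

With \widehat{\sigma}^2 taken from the fitted model itself, C_p reduces algebraically to p, so the statistic is only informative for comparing models when \widehat{\sigma}^2 comes from elsewhere, for example from a larger reference model.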

This page uses Creative Commons licensed content from Wikipedia.
