Coefficient of determination

From Psychology Wiki


In statistics, the coefficient of determination $R^2$ is the proportion of variability in a data set that is accounted for by a statistical model. There are several common and equivalent expressions for $R^2$. The version most common in statistics texts is based on an analysis of variance decomposition as follows:

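$$R^2 = \frac{SS_R}{SS_T} = 1 - \frac{SS_E}{SS_T}$$

In the above definition,

$$SS_T = \sum_i (y_i - \bar{y})^2, \quad SS_R = \sum_i (\hat{y}_i - \bar{y})^2, \quad SS_E = \sum_i (y_i - \hat{y}_i)^2$$

That is, $SS_T$ is the total sum of squares, $SS_R$ is the explained sum of squares, and $SS_E$ is the residual sum of squares.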
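As a rough illustration of the decomposition, the following sketch fits a straight line by least squares to a small made-up data set and computes the three sums of squares. The function names `fit_line` and `sums_of_squares` and the data values are assumptions chosen for the example; they do not come from the article.

```python
def fit_line(x, y):
    """Least squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return a, b


def sums_of_squares(y, y_hat):
    """Return (SS_T, SS_R, SS_E) for observed y and fitted values y_hat."""
    y_bar = sum(y) / len(y)
    ss_t = sum((yi - y_bar) ** 2 for yi in y)
    ss_r = sum((fi - y_bar) ** 2 for fi in y_hat)
    ss_e = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    return ss_t, ss_r, ss_e


# Illustrative data only (not from the article).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_line(x, y)
y_hat = [a + b * xi for xi in x]
ss_t, ss_r, ss_e = sums_of_squares(y, y_hat)

# For a least squares fit with an intercept, both expressions agree.
r2 = 1.0 - ss_e / ss_t
r2_alt = ss_r / ss_t
print(r2, r2_alt)
```

The two expressions coincide here because the fit includes an intercept and is obtained by least squares, so that the total sum of squares splits into the explained and residual parts.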

Explanation and interpretation of $R^2$

For expository purposes, consider a linear model of the form

$$Y_i = \beta_0 + \sum_{j=1}^{p} \beta_j X_{i,j} + \varepsilon_i$$

where $Y_i$ is the response variable, $\beta_0, \ldots, \beta_p$ are unknown coefficients, $X_{i,1}, \ldots, X_{i,p}$ are $p$ regressors, and $\varepsilon_i$ is a mean-zero error term. The coefficient of determination $R^2$ is a measure of the global fit of the model. Specifically, $R^2$ is an element of $[0,1]$ and represents the proportion of variability in $Y_i$ that may be attributed to some linear combination of the regressors (explanatory variables) in $X$.

More simply, $R^2$ is often interpreted as the proportion of response variation "explained" by the regressors in the model. Thus, $R^2 = 1$ indicates that the fitted model explains all variability in $y$, while $R^2 = 0$ indicates no 'linear' relationship between the response variable and regressors. An interior value such as $R^2 = 0.7$ may be interpreted as follows: "Approximately seventy percent of the variation in the response variable can be explained by the explanatory variable. The remaining thirty percent can be explained by unknown, lurking variables or inherent variability."

If there is just one scalar-valued regressor, then $R^2$ is the square of the correlation between the regressor and response variables. More generally, $R^2$ is the square of the correlation between $y$ and $\hat{y}$.
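The relationship between the coefficient of determination and the correlation can be checked numerically. The sketch below assumes a least squares line with an intercept and uses invented data; `pearson` is a hypothetical helper written for this example.

```python
from math import sqrt


def pearson(a, b):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    cov = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    var_a = sum((ai - a_bar) ** 2 for ai in a)
    var_b = sum((bi - b_bar) ** 2 for bi in b)
    return cov / sqrt(var_a * var_b)


# Illustrative data only (not from the article).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 2.9, 4.5, 4.8, 6.9, 7.1]

# Least squares line y = a + b*x.
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
a = y_bar - b * x_bar
y_hat = [a + b * xi for xi in x]

# Coefficient of determination from the residual and total sums of squares.
ss_t = sum((yi - y_bar) ** 2 for yi in y)
ss_e = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
r2 = 1.0 - ss_e / ss_t

corr_xy = pearson(x, y)        # regressor vs. response
corr_y_fit = pearson(y, y_hat)  # response vs. fitted values
print(r2, corr_xy ** 2, corr_y_fit ** 2)
```

Up to floating point rounding, the three printed values should agree for this single-regressor fit.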

Inflation of $R^2$

In least squares regression, $R^2$ is weakly increasing in the number of regressors in the model. As such, $R^2$ alone is not a meaningful basis for comparing models with different numbers of covariates. As a reminder of this, some authors denote $R^2$ by $R^2_p$, where $p$ is the number of columns in $X$.

Demonstration of this property is trivial. To begin, recall that the objective of least squares regression is (in matrix notation)

$$\min_b SS_E(b), \quad SS_E(b) = (y - Xb)^{\top}(y - Xb)$$

The optimal value of the objective is weakly smaller as additional columns of $X$ are added, by the fact that relatively unconstrained minimization leads to a solution which is weakly smaller than relatively constrained minimization. Given the previous conclusion and noting that $SS_T$ depends only on $y$, the non-decreasing property of $R^2$ follows directly from the definition above.
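The non-decreasing property can be illustrated with a small experiment: fit a model with one regressor, then add a second column that has no obvious relation to the response, and compare the two values. This is only a sketch under stated assumptions. It solves the normal equations with a hand-written Gaussian elimination, which is adequate for tiny well-conditioned examples but is not how a numerical library would do it; the names `solve` and `r_squared` and all data values are invented for the example.

```python
def solve(a, b):
    """Solve the square linear system a x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def r_squared(columns, y):
    """R^2 of the least squares fit of y on an intercept plus `columns`."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)]
           for p in range(k)]
    xty = [sum(row[p] * yi for row, yi in zip(design, y)) for p in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in design]
    y_bar = sum(y) / n
    ss_t = sum((yi - y_bar) ** 2 for yi in y)
    ss_e = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return 1.0 - ss_e / ss_t


# Illustrative data only (not from the article).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
extra = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0]  # arbitrary additional column
y = [1.2, 1.9, 3.2, 3.8, 5.1, 6.3]

r2_one = r_squared([x1], y)
r2_two = r_squared([x1, extra], y)
print(r2_one, r2_two)
```

The second value should be at least as large as the first, even though the extra column was chosen arbitrarily.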

Adjusted $R^2$

Adjusted $R^2$ is a modification of $R^2$ that adjusts for the number of explanatory terms in a model. Unlike $R^2$, the adjusted $R^2$ increases only if the new term improves the model more than would be expected by chance. The adjusted $R^2$ can be negative, and will always be less than or equal to $R^2$. The adjusted $R^2$ is defined as

$$\bar{R}^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1}$$

where p is the total number of regressors in the linear model, and n is sample size.
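A minimal sketch of the adjustment, assuming the common form $1 - (1 - R^2)(n - 1)/(n - p - 1)$ with $p$ counting the regressors but not the constant term. The function name `adjusted_r_squared` and the input values are hypothetical.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 for a model with p regressors (excluding the
    intercept) fitted to n observations."""
    if n - p - 1 <= 0:
        raise ValueError("need n > p + 1 for the adjustment to be defined")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Same raw R^2, increasing number of regressors: the adjusted value falls.
for p in (1, 3, 5):
    print(p, adjusted_r_squared(0.5, 11, p))

# A weak fit with many regressors can give a negative adjusted value.
weak = adjusted_r_squared(0.1, 11, 5)
print(weak)
```

The loop shows the penalty growing with the number of regressors for a fixed sample size, which is the behaviour the adjustment is meant to provide.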

Adjusted $R^2$ does not have the same interpretation as $R^2$. As such, care must be taken in interpreting and reporting this statistic. Adjusted $R^2$ is particularly useful in the feature selection stage of model building.

Notes on interpreting $R^2$

$R^2$ does NOT tell whether:

This page uses content from the English-language version of Wikipedia. The original article was at Coefficient of determination. The list of authors can be seen in the page history. As with Psychology Wiki, the text of Wikipedia is available under the GNU Free Documentation License.
