Psychology Wiki




In statistics, analysis of variance (ANOVA) is a collection of statistical models, and their associated procedures, which compare means by splitting the overall observed variance into different parts. The initial techniques of the analysis of variance were pioneered by the statistician and geneticist Ronald Fisher in the 1920s and 1930s, and the method is sometimes known as Fisher's ANOVA or Fisher's analysis of variance.

Overview

There are three conceptual classes of such models:

  • Fixed-effects model assumes that the data come from normal populations which differ in their means.
  • Random-effects models assume that the data describe a hierarchy of different populations whose differences are constrained by the hierarchy.
  • Mixed models describe situations where both fixed and random effects are present.

The fundamental technique is a partitioning of the total sum of squares into components related to the effects in the model used. For example, we show the model for a simplified ANOVA with one type of treatment at different levels. (If the treatment levels are quantitative and the effects are linear, a linear regression analysis may be appropriate.)

The number of degrees of freedom (abbreviated df) can be partitioned in a similar way and specifies the chi-square distribution which describes the associated sums of squares.
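To make the partition concrete, here is a minimal sketch in Python for a one-way layout with one treatment at three levels. The data are made up for illustration; only the partitioning identity itself comes from the text.

```python
# Hypothetical one-way data: one treatment at three levels
# (illustrative numbers, not from the article).
groups = {
    "low":    [4.0, 5.0, 6.0],
    "medium": [7.0, 8.0, 9.0],
    "high":   [10.0, 11.0, 12.0],
}

all_obs = [x for g in groups.values() for x in g]
n_total = len(all_obs)
grand_mean = sum(all_obs) / n_total

# Partition the total sum of squares into between- and within-group parts.
ss_total = sum((x - grand_mean) ** 2 for x in all_obs)
ss_between = sum(len(g) * ((sum(g) / len(g)) - grand_mean) ** 2
                 for g in groups.values())
ss_within = sum((x - sum(g) / len(g)) ** 2
                for g in groups.values() for x in g)

# Degrees of freedom partition the same way: (N - 1) = (k - 1) + (N - k).
df_between = len(groups) - 1           # k - 1
df_within = n_total - len(groups)      # N - k

assert abs(ss_total - (ss_between + ss_within)) < 1e-9

# The F statistic is the ratio of the mean squares.
f_stat = (ss_between / df_between) / (ss_within / df_within)  # 27.0 here
```

For these numbers, SS_total = 60 splits into SS_between = 54 and SS_within = 6, with df 8 = 2 + 6.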

Fixed-effects model

The fixed-effects model of analysis of variance applies to situations in which the experimenter has subjected his experimental material to several treatments, each of which affects only the mean of the underlying normal distribution of the "response variable".

Random-effects model

Random-effects models are used to describe situations in which incomparable differences in experimental material occur. The simplest example is that of estimating the unknown mean of a population whose individuals differ from each other. In this case, the variation between individuals is confounded with that of the observing instrument. ...

Degrees of freedom

Degrees of freedom indicates the effective number of observations which contribute to the sum of squares in an ANOVA, the total number of observations minus the number of linear constraints in the data...

Tests of significance

Analyses of variance lead to tests of statistical significance using Fisher's F-distribution.


Introduction

In our previous chapters we explored the use of a single variable in research; however, much of the research done in psychology involves several variables. This is because there are few instances where researchers can explain human behavior with a single variable. Our previously learned material covered one independent variable; in reality, research questions more often involve more than one independent variable.

This chapter will explore the two-variable between-subjects design; the statistical method used to analyze this type of design is known as the two-way ANOVA.

Advantage of the Two-variable Design

In research, a two-variable design offers many advantages over a one-variable design. The first advantage is increased efficiency: the two-variable design contains all of the elements of two one-variable designs, so running one two-variable study is more cost-effective than running two separate one-variable experiments.

Another advantage is that we can analyze the interaction of the two variables in the design. This helps us understand how combinations of variables influence behavior. In particular, it allows us to understand and analyze the interactive effects of the two independent variables on the dependent variable. Here, interaction means that the effect of one independent variable is influenced by another independent variable; equivalently, the relationship between one independent variable and the dependent variable differs at the various levels of the other independent variable.

For example, the researchers Cohen, Nisbett, Bowdle, and Schwarz (1996) conducted an experiment examining the reactions of white male participants who had just been insulted versus those who had not been insulted, using males from the Northern and Southern regions of the United States. They measured the participants' testosterone levels as an operational definition of their level of aggression, because testosterone is easily measured through saliva samples and correlates with arousal, especially aggression. The hypothesis was that participants who had been insulted would show higher testosterone levels than participants who had not; however, there was a second independent variable: the regional background of the participants. The researchers selected half of the participants from Northern regions and the other half from Southern regions. They predicted that the men from the South, who had a "culture of honor", would show a greater increase in testosterone levels than the men from the North. This is because men in the South are raised to protect their character when attacked through insults or violence, whereas men in the North are not, so the researchers expected that a Southern man who had been insulted would show more arousal.

This example shows how the variable of culture (Northern or Southern background) influences the variable of aggression (in the form of testosterone levels), and it also shows how being insulted affects aggression levels. There is an interaction between the two variables if the effect of one variable is not consistent across all levels of the other variable. For example, if the results of the study showed that the Northern males' testosterone levels did not rise when insulted but the Southern males' levels did, there would be an interaction. Remember, the interaction is between the independent variables: being a Southern male and being insulted interact to produce increased aggression, while being a Northern male and being insulted do not.

The last advantage of using a two-variable ANOVA design is an increase in statistical power. If you recall, power is the ability to confidently reject a false null hypothesis. This type of research design increases statistical power because its within-groups variance tends to be smaller than the within-groups variance of a comparable one-variable study (two one-way ANOVAs). The within-groups variance is the error term in the denominator of the F-ratio, so a smaller within-groups variance yields a larger F-ratio for the same treatment effect. A larger F-ratio makes it more likely that a real effect will reach statistical significance; thus we have greater statistical power to correctly reject a false null hypothesis.
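The power argument can be illustrated numerically. In this sketch, the same between-groups mean square is paired with a smaller and a larger within-groups mean square; all numbers are made up for illustration.

```python
# Illustration (made-up numbers): the same treatment mean square paired
# with a smaller vs. larger within-groups (error) mean square.
ms_between = 27.0         # hypothetical between-groups mean square
ms_within_small = 1.0     # smaller error variance (two-variable design)
ms_within_large = 3.0     # larger error variance (one-variable design)

f_small_error = ms_between / ms_within_small   # F = 27.0
f_large_error = ms_between / ms_within_large   # F = 9.0

# A smaller within-groups variance yields a larger F-ratio for the same
# treatment effect, and hence more power to detect it.
assert f_small_error > f_large_error
```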

Summary

The advantages of using a two-variable design via the two-way ANOVA:

  • Decrease in cost
  • The ability to analyze the interaction of two independent variables
  • Increased statistical power due to smaller variance

The Logic of Two-Variable Design

The first concept to consider with a two-variable design is the concept of a treatment combination. Initially, when we design a two-variable study, we select the number of levels we want to use for each variable. Because we combine the two variables into one study, we create something called a factorial design. A factorial design represents a study that includes an independent group for each possible combination of levels of the independent variables. For example, in the Cohen et al. (1996) experiment, there are two levels of the insult condition and two levels of participant background. Because of this, the study uses a 2*2 factorial design. This design creates 4 independent groups, because it crosses each level of each independent variable with each level of the other. In the Cohen experiment, there are 2 levels of the independent variable insult (control and insulted) and 2 levels of the independent variable culture (Northern or Southern); therefore, there are 2 independent variables with 2 levels each. Thus, it is a 2*2 factorial design. An example appears below:

[Figure: the 2*2 design of the Cohen et al. (1996) experiment]

Each combination of levels for the independent variable creates a treatment condition or cell. A 2*2 factorial design creates 4 treatment conditions or cells. If a study has 3 levels of one independent variable and 4 levels of another independent variable, then this study creates 12 treatment conditions or cells.
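The cells of a factorial design are simply the Cartesian product of the levels of each factor, which a short sketch makes explicit (the 3-by-4 factor names below are placeholders, not from the article):

```python
from itertools import product

# Treatment combinations (cells) of a factorial design are the Cartesian
# product of the factor levels. These levels follow the Cohen et al. example.
insult = ["control", "insulted"]
background = ["north", "south"]

cells = list(product(insult, background))
assert len(cells) == 2 * 2   # a 2*2 design has 4 cells

# The article's second example: 3 levels crossed with 4 levels gives 12 cells.
factor_a = ["a1", "a2", "a3"]         # placeholder level names
factor_b = ["b1", "b2", "b3", "b4"]
assert len(list(product(factor_a, factor_b))) == 3 * 4
```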

When both variables are between-subjects variables, we can conclude that the individual cells are independent groups. Independence means that the data collected for one cell do not correlate with the other cells.

B. General linear model

When we discussed the one-way ANOVA, we learned that the logic underlying it is the general linear model: each observation is the sum of the baseline grand mean, the treatment effect, and the within-group random error.

Xij= μ + αj + εij

The logic of the two-way ANOVA is also the general linear model. However, in the general linear model for the two-way ANOVA, there are two more components:

Xijk= μ + αj + βk + αβjk + εijk

That is, individual observation Xijk is the sum of the baseline grand mean, the effects of independent variable A at j level, the effects of independent variable B at k level, the joint effects of A at j level and B at k level, and the random error within the group of the combination of A at j level and B at k level.

[Figure: the 2*2 design of the Cohen et al. (1996) experiment]

For example, for the first subject in the first cell (X1jk), the observed score would be the sum of the grand mean (M), the difference between mean score for all subjects in the control group (Ma1) and the grand mean (M), the difference between mean score for all subjects from northern (Mb1) and the grand mean (M), the difference between the mean score in the first cell (Ma1b1) and the mean scores of the two independent variables at particular levels, and the difference between the observed score and the mean score in the first cell (which represents the within group random error).

X1jk = M + (Ma1-M) + (Mb1-M) + (Ma1b1- Ma1- Mb1+M) + (X1jk - Ma1b1)
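This decomposition can be checked numerically. Here is a minimal sketch with a hypothetical balanced 2*2 data set (all values made up); it rebuilds one observation from the grand mean, the two marginal effects, the interaction term, and the residual.

```python
# Hypothetical balanced 2*2 data: factor A (a1=control, a2=insult) crossed
# with factor B (b1=north, b2=south), two made-up observations per cell.
data = {
    ("a1", "b1"): [4.0, 5.0],
    ("a1", "b2"): [5.0, 6.0],
    ("a2", "b1"): [5.0, 6.0],
    ("a2", "b2"): [8.0, 9.0],
}

def mean(xs):
    return sum(xs) / len(xs)

all_obs = [x for cell in data.values() for x in cell]
m = mean(all_obs)                                   # grand mean M

# Marginal means for each level of A and of B.
ma = {a: mean([x for (ai, _), c in data.items() if ai == a for x in c])
      for a in ("a1", "a2")}
mb = {b: mean([x for (_, bi), c in data.items() if bi == b for x in c])
      for b in ("b1", "b2")}

# Check the identity for the first observation in cell (a1, b1):
# X = M + (Ma1-M) + (Mb1-M) + (Ma1b1-Ma1-Mb1+M) + (X-Ma1b1)
x = data[("a1", "b1")][0]
m_cell = mean(data[("a1", "b1")])
rebuilt = (m
           + (ma["a1"] - m)                          # effect of A at level 1
           + (mb["b1"] - m)                          # effect of B at level 1
           + (m_cell - ma["a1"] - mb["b1"] + m)      # joint (interaction) effect
           + (x - m_cell))                           # within-cell random error
assert abs(rebuilt - x) < 1e-9
```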

C. Components of the sum of squares

If we rearrange the above equation, we get:

X1jk - M= (Ma1-M) + (Mb1-M) + (Ma1b1- Ma1- Mb1+M) + (X1jk - Ma1b1)

We can see that the deviation of each observation from the grand mean is sum of the deviation of the mean score of the first independent variable at one particular level from the grand mean, the deviation of the mean score of the second independent variable at one particular level from the grand mean, the deviation of the mean score of the combination of two independent variables from the mean scores of the two independent variables at particular levels, and the random error.

If we square all the parts of the equation and sum the deviations for all the subjects, we get:

SStotal=SSA+SSB+SSAB+SSwithin

The above equation indicates that the total sum of squares can be decomposed into four parts: the sum of squares between different levels of the first independent variable, the sum of squares between different levels of the second independent variable, the sum of squares between different combinations of the two independent variables (that is, between different cells), and the sum of squares within groups.

D. Components of variance

If we divide each sum of squares by its corresponding degrees of freedom, we get five types of variance. The total variance is the total sum of squares divided by the total degrees of freedom (N-1); it is also called the mean square total. The variance due to an independent variable is that variable's sum of squares divided by its degrees of freedom, which is the number of levels of the variable minus 1. Therefore, the variance due to the insult condition in the above example is SSA/(2-1) (dfA=j-1), and the variance due to participant background is SSB/(2-1) (dfB=k-1).

The variance due to the combination (or interaction) of the two independent variables is the interaction sum of squares divided by the interaction degrees of freedom, which is the product of the degrees of freedom of the two independent variables: dfAB=dfA*dfB=(j-1)*(k-1).

The within-groups variance is the within-groups sum of squares divided by the within-groups degrees of freedom, which is N-j*k.
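The full two-way partition can be sketched end to end. This example uses hypothetical balanced 2*2 data (two observations per cell, all values made up) and verifies the identity SStotal = SSA + SSB + SSAB + SSwithin before forming the mean squares and F-ratios.

```python
# Hypothetical balanced 2*2 data; cell keys are (A level, B level), n = 2.
data = {
    ("a1", "b1"): [4.0, 5.0], ("a1", "b2"): [5.0, 6.0],
    ("a2", "b1"): [5.0, 6.0], ("a2", "b2"): [8.0, 9.0],
}
a_levels, b_levels, n = ["a1", "a2"], ["b1", "b2"], 2

def mean(xs):
    return sum(xs) / len(xs)

all_obs = [x for c in data.values() for x in c]
N = len(all_obs)
m = mean(all_obs)
ma = {a: mean([x for (ai, _), c in data.items() if ai == a for x in c])
      for a in a_levels}
mb = {b: mean([x for (_, bi), c in data.items() if bi == b for x in c])
      for b in b_levels}
m_cell = {key: mean(c) for key, c in data.items()}

j, k = len(a_levels), len(b_levels)

# The four components of the total sum of squares.
ss_a = n * k * sum((ma[a] - m) ** 2 for a in a_levels)
ss_b = n * j * sum((mb[b] - m) ** 2 for b in b_levels)
ss_ab = n * sum((m_cell[(a, b)] - ma[a] - mb[b] + m) ** 2
                for a in a_levels for b in b_levels)
ss_within = sum((x - m_cell[key]) ** 2 for key, c in data.items() for x in c)
ss_total = sum((x - m) ** 2 for x in all_obs)
assert abs(ss_total - (ss_a + ss_b + ss_ab + ss_within)) < 1e-9

# Degrees of freedom, mean squares (variances), and F-ratios.
df_a, df_b = j - 1, k - 1
df_ab = df_a * df_b
df_within = N - j * k
f_a = (ss_a / df_a) / (ss_within / df_within)
f_b = (ss_b / df_b) / (ss_within / df_within)
f_ab = (ss_ab / df_ab) / (ss_within / df_within)
```

For these numbers, SStotal = 20 splits into SSA = 8, SSB = 8, SSAB = 2, and SSwithin = 2, with df 7 = 1 + 1 + 1 + 4.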

Main effects and interaction

A. Main effects

A main effect refers to the effect that one independent variable has on the dependent variable holding the effects of the other variables constant. Specifically, a main effect represents a special form of the between-groups variance of a single independent variable. In a two-factor ANOVA, there are two main effects, one for each factor. When we examine the data using an ANOVA, each main effect can be either statistically significant or not statistically significant. Consequently, there are four potential patterns of results:

1) A statistically significant main effect for Factor A but not for Factor B

[Figure: significant main effect for Factor A, no interaction]

2) A statistically significant main effect for Factor B but not for Factor A

[Figure: significant main effect for Factor B, no interaction]

3) Statistically significant main effects for both factors

In each of these patterns, the two lines are parallel and there is no interaction between the two independent variables: the relationship between one independent variable and the dependent variable, for example between the insult condition and the reaction, is not different at different levels of the second variable (the backgrounds of the participants in this example).

4) No statistically significant main effect for either factor


B. Interaction

An interaction indicates that the effect of one variable is not consistent across all levels of the other variable. That is, the relationship between one independent variable and the dependent variable differs at different levels of the other variable. All of the figures above depict situations with no interaction. When there is no interaction, the two lines are parallel; when there is an interaction, they are not. For a 2*2 factorial design, there are four possible interaction patterns:

1) There is a statistically significant main effect for Factor A, the insult condition, but there is no statistically significant main effect for Factor B.

[Figure: main effect for Factor A only, with interaction]


There is a main effect for Factor A, the insult condition: without considering the background difference, there is a significant difference between the control group (M=4.5) and the insult group (M=7.5). There is no significant main effect for Factor B, the background: without considering the insult condition, there is no difference between the southerners (M=6.0) and the northerners (M=6.0). There is an interaction between the two factors: the southerners had greatly elevated testosterone levels after they had been insulted, whereas the northern participants' testosterone did not change across the two insult conditions. The hallmark of the interaction is that the two lines are not parallel.

2) There is a statistically significant main effect for Factor B, but there is no statistically significant main effect for Factor A.

[Figure: main effect for Factor B only, with interaction]

There is a main effect for participant background: without considering the insult condition, the southerners have a higher testosterone level (M=7.5) than do the northerners (M=4.5). There is no main effect for the insult conditions: without considering the background difference, the participants in the control group (M=6.0) have a similar testosterone level to the participants in the insult group (M=6.0). There is an interaction between the two factors: the insult had opposite effects on the southern and northern participants, causing the southern participants' testosterone to increase while the northern participants' testosterone decreased.

3) Both main effects are statistically significant and there is interaction.

[Figure: both main effects significant, with interaction]

There is a statistically significant main effect of Factor A: without considering the background difference, there is a significant difference between the control group (M=3.0) and the insult group (M=9.0). There is a statistically significant main effect of Factor B: without considering the insult condition, the southerners have a higher testosterone level (M=7.5) than do the northerners (M=4.5). There is a statistically significant interaction between the two factors: the insult condition raised testosterone levels overall, but the southern participants showed greater increases in testosterone than did the northern participants.

4) Neither main effect is statistically significant, but the interaction is.

[Figure: crossover interaction with no main effects]

There is no main effect for the insult conditions: without considering the background difference, the participants in the control group (M=6.0) have a similar testosterone level to the participants in the insult group (M=6.0). There is no significant main effect for Factor B, the background: without considering the insult condition, there is no difference between the southerners (M=6.0) and the northerners (M=6.0). However, there is a statistically significant interaction: for southerners, the insult increases the testosterone level, while for northerners it decreases it.
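This crossover pattern can be sketched from cell means. The exact cell values below are assumptions chosen to match the described pattern (all marginal means equal to 6.0 with opposite simple effects), not values taken from the figure.

```python
# Hypothetical cell means consistent with pattern 4: every marginal mean
# equals 6.0, yet the insult effect reverses across backgrounds.
cell = {
    ("control", "north"): 7.5, ("insult", "north"): 4.5,
    ("control", "south"): 4.5, ("insult", "south"): 7.5,
}

def marginal(level, axis):
    # Mean over the two cells sharing this level of one factor.
    return sum(v for key, v in cell.items() if key[axis] == level) / 2

# Both main effects vanish: every marginal mean is 6.0 ...
for level, axis in [("control", 0), ("insult", 0), ("north", 1), ("south", 1)]:
    assert marginal(level, axis) == 6.0

# ... but the simple effect of insult differs by background (interaction).
insult_effect_south = cell[("insult", "south")] - cell[("control", "south")]  # +3.0
insult_effect_north = cell[("insult", "north")] - cell[("control", "north")]  # -3.0
assert insult_effect_south != insult_effect_north  # non-parallel lines
```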
