# Experimental design




The first statistician to consider a methodology for the design of experiments was Sir Ronald A. Fisher. He described how to test the hypothesis that a certain lady could distinguish by flavor alone whether the milk or the tea was first placed in the cup. While this sounds like a frivolous application, it allowed him to illustrate the most important means of experimental design.

Analysis of the design of experiments was built on the foundation of the analysis of variance, a collection of models in which the observed variance is partitioned into components due to different factors which are estimated and/or tested.
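The partitioning idea can be illustrated with a minimal Python sketch for a single factor. The data and the function name `partition_sum_of_squares` are hypothetical, chosen only for illustration; the sketch checks that the total sum of squares splits into a between-group part (attributable to the factor) and a within-group part (residual).

```python
from statistics import fmean

# Hypothetical responses for three levels of one factor (made-up numbers).
groups = {
    "A": [4.0, 5.0, 6.0],
    "B": [7.0, 8.0, 9.0],
    "C": [1.0, 2.0, 3.0],
}


def partition_sum_of_squares(groups):
    """Split the total sum of squares into between- and within-group parts."""
    all_values = [x for values in groups.values() for x in values]
    grand_mean = fmean(all_values)

    # Total variation of every observation around the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)

    # Variation of the group means around the grand mean (due to the factor).
    ss_between = sum(
        len(values) * (fmean(values) - grand_mean) ** 2
        for values in groups.values()
    )

    # Variation of observations around their own group mean (residual).
    ss_within = sum(
        (x - fmean(values)) ** 2
        for values in groups.values()
        for x in values
    )
    return ss_total, ss_between, ss_within


ss_total, ss_between, ss_within = partition_sum_of_squares(groups)
print(ss_total, ss_between, ss_within)
```

In an analysis of variance, the between-group and within-group parts are each divided by their degrees of freedom and compared; the sketch stops at the partition itself.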

Some efficient designs for estimating several main effects simultaneously were found by Raj Chandra Bose and K. Kishen in 1940 at the Indian Statistical Institute, but remained little known until the Plackett-Burman designs were published in Biometrika in 1946.

In 1950, Gertrude Mary Cox and William Cochran published the book Experimental Designs, which became the major reference work on the design of experiments for statisticians for years afterwards.

Developments of the theory of linear models have encompassed and surpassed the cases that concerned early writers. Today, the theory rests on advanced topics in abstract algebra and combinatorics.

As with all other branches of statistics, there is both classical and Bayesian experimental design.

## Example

This example is attributed to Harold Hotelling in [1]. Although very simple, it conveys at least some of the flavor of the subject.

The weights of eight objects are to be measured using a pan balance that measures the difference between the total weights of the objects in the two pans. Each measurement has a random error. The average error is zero; the standard deviation of the probability distribution of the errors is the same number σ on every weighing; and errors on different weighings are independent. Denote the true weights by

$\theta_1, \dots, \theta_8.\,$

We consider two different experiments:

• Weigh each object in one pan, with the other pan empty. Call the measured weight of the $i$th object $X_i$ for $i = 1, \dots, 8$.
• Do the eight weighings according to the following schedule and let $Y_i$ be the measured difference for $i = 1, \dots, 8$:
$\begin{matrix} & \mbox{left pan} & \mbox{right pan} \\ \mbox{1st weighing:} & 1\ 2\ 3\ 4\ 5\ 6\ 7\ 8 & \\ \mbox{2nd:} & 1\ 2\ 3\ 8\ & 4\ 5\ 6\ 7 \\ \mbox{3rd:} & 1\ 4\ 5\ 8\ & 2\ 3\ 6\ 7 \\ \mbox{4th:} & 1\ 6\ 7\ 8\ & 2\ 3\ 4\ 5 \\ \mbox{5th:} & 2\ 4\ 6\ 8\ & 1\ 3\ 5\ 7 \\ \mbox{6th:} & 2\ 5\ 7\ 8\ & 1\ 3\ 4\ 6 \\ \mbox{7th:} & 3\ 4\ 7\ 8\ & 1\ 2\ 5\ 6 \\ \mbox{8th:} & 3\ 5\ 6\ 8\ & 1\ 2\ 4\ 7 \end{matrix}$
Then the estimated value of the weight $\theta_1$ is
$\widehat{\theta}_1 = \frac{Y_1 + Y_2 + Y_3 + Y_4 - Y_5 - Y_6 - Y_7 - Y_8}{8}.$

The question of design of experiments is: which experiment is better?

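The variance of the estimate $X_1$ of $\theta_1$ is $\sigma^2$ if we use the first experiment. But if we use the second experiment, the variance of the estimate given above is $\sigma^2/8$: the numerator combines eight independent measurements, each with variance $\sigma^2$, so its variance is $8\sigma^2$, and dividing by 8 scales the variance by $1/64$. Thus the second experiment gives us 8 times as much precision.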
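The comparison can be checked numerically. The following Python sketch encodes the weighing schedule above as a matrix of +1 (left pan) and -1 (right pan), confirms that its columns are mutually orthogonal, and simulates both experiments. The true weights, the choice of normally distributed errors, and all function names are assumptions made for illustration; the example itself only requires errors with mean zero, common standard deviation σ, and independence.

```python
import random
from statistics import pvariance

# Objects placed in the left pan on each of the eight weighings, copied from
# the schedule above; every other object goes in the right pan.
LEFT_PAN = [
    {1, 2, 3, 4, 5, 6, 7, 8},
    {1, 2, 3, 8},
    {1, 4, 5, 8},
    {1, 6, 7, 8},
    {2, 4, 6, 8},
    {2, 5, 7, 8},
    {3, 4, 7, 8},
    {3, 5, 6, 8},
]

# DESIGN[i][j] is +1 if object j+1 is in the left pan on weighing i+1, else -1.
DESIGN = [[1 if j in left else -1 for j in range(1, 9)] for left in LEFT_PAN]


def columns_are_orthogonal(design):
    """Check that distinct columns have dot product 0 and each has norm^2 n."""
    n = len(design)
    for a in range(n):
        for b in range(n):
            dot = sum(design[i][a] * design[i][b] for i in range(n))
            if dot != (n if a == b else 0):
                return False
    return True


def estimate_weights(design, y):
    """Estimate each weight as the signed sum of the measurements over n."""
    n = len(design)
    return [sum(design[i][j] * y[i] for i in range(n)) / n for j in range(n)]


def simulate_variances(theta, sigma, trials, seed=0):
    """Empirical variance of the estimate of theta_1 under each experiment."""
    rng = random.Random(seed)
    direct, combined = [], []
    for _ in range(trials):
        # Experiment 1: weigh object 1 alone (normal errors are an assumption).
        direct.append(theta[0] + rng.gauss(0.0, sigma))
        # Experiment 2: eight combined weighings, then the signed average.
        y = [
            sum(sign * weight for sign, weight in zip(row, theta))
            + rng.gauss(0.0, sigma)
            for row in DESIGN
        ]
        combined.append(estimate_weights(DESIGN, y)[0])
    return pvariance(direct), pvariance(combined)


# Hypothetical true weights, used only to drive the simulation.
true_weights = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0]
var_direct, var_combined = simulate_variances(true_weights, sigma=1.0, trials=10000)
print(var_direct, var_combined, var_direct / var_combined)
```

With σ = 1 the two printed variances should come out close to 1 and 1/8, up to simulation noise. The orthogonality of the columns is what lets the same eight weighings estimate every one of the eight weights with variance $\sigma^2/8$, not just $\theta_1$.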

Many problems of the design of experiments involve combinatorial designs, as in this example.

## Types of design

Some of the most popular designs are sorted below. The ones at the top are the most powerful at reducing the observer-expectancy effect, but they are also the most expensive and in some cases raise ethical concerns. The ones at the bottom are the most affordable and are frequently used earlier in the research cycle, to develop strong hypotheses worth testing with the more expensive research approaches.

### Descriptive

• Community survey

## Ordering of conditions

An important aspect of some experiment designs is the ordering of different experimental conditions.
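One common way to handle ordering, not named in this article and offered here only as an illustrative assumption, is to counterbalance: give different participants different orders so that each condition appears equally often in each position. A minimal Python sketch using a cyclic Latin square follows; the function name and the condition labels are hypothetical.

```python
def cyclic_latin_square(conditions):
    """Return k orderings in which each condition occupies each position once."""
    k = len(conditions)
    return [
        [conditions[(row + col) % k] for col in range(k)]
        for row in range(k)
    ]


# Hypothetical condition labels; each row is the order given to one participant.
orders = cyclic_latin_square(["A", "B", "C", "D"])
for participant, order in enumerate(orders, start=1):
    print(participant, order)
```

A cyclic square balances position only. It does not balance which condition immediately precedes which, so carry-over effects may call for a different construction.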

## Important considerations

When choosing a study design, many factors must be taken into account. Different types of studies are subject to different types of bias. For example, recall bias is likely to occur in cross-sectional or case-control studies where subjects are asked to recall exposure to risk factors. Subjects with the relevant condition (e.g. breast cancer) may be more likely to recall the relevant exposures (e.g. hormone replacement therapy) than subjects who do not have the condition.

The ecological fallacy may occur when analyses are done on ecological (group-based) data rather than individual data. This type of analysis tends to overestimate the degree of association between variables.
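The overestimation can be seen in a small Python sketch. The data are entirely made up for illustration: three groups of three individuals whose group averages line up perfectly, while the individual values do not. The helper `pearson` is a plain implementation of the Pearson correlation coefficient.

```python
from math import sqrt
from statistics import fmean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical (exposure, outcome) pairs for individuals in three groups.
groups = [
    [(1, 3), (2, 2), (3, 1)],
    [(4, 6), (5, 5), (6, 4)],
    [(7, 9), (8, 8), (9, 7)],
]

# Individual-level analysis: every person is one data point.
individuals = [pair for group in groups for pair in group]
individual_r = pearson([x for x, _ in individuals], [y for _, y in individuals])

# Ecological analysis: each group is reduced to its pair of averages.
group_x = [fmean(x for x, _ in group) for group in groups]
group_y = [fmean(y for _, y in group) for group in groups]
group_r = pearson(group_x, group_y)

print(individual_r, group_r)
```

In this constructed data set the group-level correlation is stronger than the individual-level one, because averaging removes the within-group variation that weakens the association among individuals.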