Sample size

From Psychology Wiki


Sample size, usually designated N, is the number of repeated measurements in a statistical sample. These measurements are used to estimate a parameter, a descriptive quantity of some population, and N determines the precision of that estimate: larger N gives smaller error bounds of estimation. A typical statement is that one can be 95% sure the true parameter is within ±B of the estimate, where B is an error bound that decreases with increasing N. Such a bounded estimate is referred to as a confidence interval for that parameter.

For example, perhaps the simplest rule of thumb of this kind is the one for a proportion in a population: the maximum bound, B, of a 95% confidence interval for an unknown proportion is approximately 1/sqrt(N). So N = 100 gives B = 10%, N = 400 gives B = 5%, N = 1000 gives B ≈ 3%, and N = 10000 gives B = 1%. These numbers are often quoted in news reports of opinion polls and other sample surveys.
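The figures quoted above can be reproduced with a few lines of Python. This is only a minimal sketch of the 1/sqrt(N) rule of thumb; the function name `max_error_bound` is a hypothetical helper chosen for illustration, not part of any library.

```python
import math


def max_error_bound(n: int) -> float:
    """Rule-of-thumb maximum 95% error bound for a proportion: 1/sqrt(n).

    Hypothetical helper for illustration only.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return 1.0 / math.sqrt(n)


# Print the bound, as a percentage, for the sample sizes mentioned in the text.
for n in (100, 400, 1000, 10000):
    print(f"N = {n:>5}: B = {max_error_bound(n):.1%}")
```

Because the bound shrinks only with the square root of N, halving B requires roughly four times as many measurements.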

For sufficiently large N, usually at least 30, the general 95% confidence interval for a population mean or "expected value" is the sample mean ± B, where B = 2 sqrt(V/N) and V is the variance of the sampled variable. Conversely, N = 4V/B^2.
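As a rough illustration, both directions of this relationship can be sketched in Python. The function names are hypothetical, and the sketch assumes that the sample variance is used as a stand-in for V when the population variance is unknown, which the text above does not specify.

```python
import math
import statistics


def mean_confidence_interval(sample):
    """Approximate 95% interval for the mean: sample mean +/- 2*sqrt(V/N).

    Assumption: V is estimated by the sample variance.
    """
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two measurements to estimate V")
    mean = statistics.fmean(sample)
    variance = statistics.variance(sample)  # sample variance as estimate of V
    bound = 2.0 * math.sqrt(variance / n)
    return mean - bound, mean + bound


def required_sample_size(variance: float, bound: float) -> int:
    """Smallest whole N with 2*sqrt(V/N) <= bound, i.e. N = 4V/B^2 rounded up."""
    if variance < 0 or bound <= 0:
        raise ValueError("variance must be non-negative and bound positive")
    return math.ceil(4.0 * variance / bound**2)


# Hypothetical example: a variable with standard deviation 15 (V = 225),
# where the mean is to be estimated to within +/- 3 units.
print(required_sample_size(225.0, 3.0))
```

In this hypothetical case the formula gives N = 4 × 225 / 9 = 100, which is comfortably above the N of about 30 that the approximation assumes.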

The rule of thumb for the maximum B for a proportion derives from the fact that the estimator of a proportion, X/N (where the count X has a binomial distribution), is also the sample mean of N draws from a Bernoulli distribution, whose variance p(1 - p) is at most 0.25. For sufficiently large N, the Central Limit Theorem says that this sample mean closely approximates a normal distribution, which contains ~95% of its values within 2 standard deviations of its mean. One simply envisions those bounds being shifted from around the population mean to around its estimator. The maximum 95% error bound, twice the standard error of X/N, where X and N are yet to be determined, is therefore B = 2 sqrt(0.25/N) = 1/sqrt(N). Conversely, N = 1/B^2.
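The two ingredients of this argument can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, p denotes the true population proportion, and the coverage is computed exactly from the binomial distribution rather than from the normal approximation.

```python
import math


def bernoulli_variance(p: float) -> float:
    """Variance p(1 - p) of a single Bernoulli trial with success probability p."""
    return p * (1.0 - p)


def coverage(n: int, p: float) -> float:
    """Exact probability that X/n lies within 1/sqrt(n) of p, X ~ Binomial(n, p)."""
    bound = 1.0 / math.sqrt(n)
    total = 0.0
    for x in range(n + 1):
        # Small tolerance so floating-point rounding does not drop boundary values.
        if abs(x / n - p) <= bound + 1e-12:
            total += math.comb(n, x) * p**x * (1.0 - p) ** (n - x)
    return total


# The Bernoulli variance peaks at p = 0.5, which is why 0.25 is the worst case.
print(max(bernoulli_variance(k / 100) for k in range(101)))

# Coverage of the +/- 1/sqrt(N) interval for N = 100 at two values of p.
for p in (0.5, 0.2):
    print(f"p = {p}: coverage = {coverage(100, p):.3f}")
```

For N = 100 the interval covers a little more than 95% of outcomes at p = 0.5 and more still at p = 0.2, consistent with 1/sqrt(N) being a maximum bound rather than an exact one.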

This page uses content from the English-language version of Wikipedia. The original article was at Sample size. The list of authors can be seen in the page history. As with Psychology Wiki, the text of Wikipedia is available under the GNU Free Documentation License.