Inferential statistics

Inferential statistics or statistical induction comprises the use of statistics to make inferences concerning some unknown aspect (usually a parameter) of a population.

Two schools of inferential statistics are frequentist inference (based on frequency probability), which typically uses maximum likelihood estimation, and Bayesian inference. The following is an example of the latter.

Deduction and induction

From a population containing N items of which I are special, a sample containing n items of which i are special can be chosen in

${I \choose i}{{N-I} \choose {n-i}}$

ways (see multiset and binomial coefficient).

Fixing (N,n,I), this expression is the unnormalized deduction distribution function of i.

Fixing (N,n,i), this expression is the unnormalized induction distribution function of I.
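The counting expression above can be normalized in either direction. The following is a minimal Python sketch, not part of the original article; the function names `ways`, `deduction_distribution` and `induction_distribution` are hypothetical, and normalizing over I with no extra weighting amounts to assuming that every value of I from 0 to N is equally likely beforehand.

```python
from math import comb

def ways(N, n, I, i):
    """Number of samples of n items with i special, from N items with I special."""
    return comb(I, i) * comb(N - I, n - i)

def deduction_distribution(N, n, I):
    """Normalized distribution of i for fixed (N, n, I)."""
    weights = {i: ways(N, n, I, i) for i in range(n + 1)}
    total = sum(weights.values())
    return {i: w / total for i, w in weights.items()}

def induction_distribution(N, n, i):
    """Normalized distribution of I for fixed (N, n, i).

    Assumption: all values I = 0..N are weighted equally before sampling.
    """
    weights = {I: ways(N, n, I, i) for I in range(N + 1)}
    total = sum(weights.values())
    return {I: w / total for I, w in weights.items()}

if __name__ == "__main__":
    print(deduction_distribution(10, 4, 3))
    print(induction_distribution(10, 4, 1))
```

Both functions reuse the same expression and differ only in which variable is summed over, which is the point made in the text.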

Mean ± standard deviation

The mean ± the standard deviation of the deduction distribution is used to estimate i when (N,n,I) is known:

$i \approx f(N,n,I)$

where

$f(N,n,I)=\frac{nI\pm\sqrt{\frac{nI(N-n)(N-I)}{N-1}}}{N}.$

The mean ± the standard deviation of the induction distribution is used to estimate I when (N,n,i) is known:

$I \approx -1-f(-2-n,-2-N,-1-i).$

Thus deduction is translated into induction by means of the involution

$(N,n,I,i) \leftrightarrow (-2-n,-2-N,-1-i,-1-I).$
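The involution can be checked numerically. The sketch below is an illustration under stated assumptions rather than the article's own code: `f` returns the mean and the standard deviation as a pair instead of a single "±" expression, and the helper names `induce` and `brute_force` are hypothetical. The brute-force version sums directly over the normalized induction weights, assuming all I from 0 to N are weighted equally beforehand.

```python
from math import comb, sqrt

def f(N, n, I):
    """(mean, sd) for f(N,n,I) = (nI ± sqrt(nI(N-n)(N-I)/(N-1))) / N.

    Not defined for N = 0 or N = 1 (division by zero).
    """
    mean = n * I / N
    # abs(N) keeps the sd non-negative when N is negative (induction case).
    sd = sqrt(n * I * (N - n) * (N - I) / (N - 1)) / abs(N)
    return mean, sd

def induce(N, n, i):
    """Estimate I from (N, n, i) via I ≈ -1 - f(-2-n, -2-N, -1-i)."""
    mean, sd = f(-2 - n, -2 - N, -1 - i)
    return -1 - mean, sd

def brute_force(N, n, i):
    """Mean and sd of I computed directly from the induction weights."""
    w = [comb(I, i) * comb(N - I, n - i) for I in range(N + 1)]
    total = sum(w)
    mean = sum(I * wI for I, wI in enumerate(w)) / total
    var = sum((I - mean) ** 2 * wI for I, wI in enumerate(w)) / total
    return mean, sqrt(var)

if __name__ == "__main__":
    for case in [(1, 0, 0), (10, 4, 1), (20, 5, 5)]:
        print(case, induce(*case), brute_force(*case))
```

For the transformed arguments the quantity under the square root stays non-negative, because the two negative factors N-I and N-1 cancel in sign.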

Example

The population contains a single item and the sample is empty, so (N,n,i)=(1,0,0). The induction formula gives

$I\approx -1-f(-2,-3,-1)=\frac{1}{2}\pm\frac{1}{2}$

confirming that the number of special items in the population is either 0 or 1.

(The frequency probability solution to this problem is $I\approx \frac{Ni}{n}=\frac{0}{0}$, which is undefined.)

Limiting cases

Binomial and Beta

In the limiting case where N is a large number, the deduction distribution of i tends towards the binomial distribution with the probability $P=\frac{I}{N}$ as a parameter,

$i\approx nP\left (1\pm\sqrt{\frac{\frac{1}{P}-1}{n}}\right )$

and the induction distribution of $P$ tends towards the beta distribution

$P\approx\frac{i+1\pm\sqrt{\frac{(i+1)(n-i+1)}{n+3}}}{n+2}.$

(The frequency probability solution to this problem is $P \approx \frac{i}{n}$: the probability is estimated by the relative frequency.)

Example

The population is large and the sample is empty, so n=i=0. The beta distribution formula gives $P \approx(50 \pm 29)\%$.

(The frequency probability solution to this problem is $P \approx \frac{i}{n}=\frac{0}{0}$, which is undefined.)
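The two large-N formulas can be evaluated directly. This is a small sketch with hypothetical function names. The binomial standard deviation is written as $\sqrt{nP(1-P)}$, which is algebraically the same as the form given above but also works for P = 0.

```python
from math import sqrt

def binomial_estimate(n, P):
    """(mean, sd) of i in the binomial limit: nP ± sqrt(nP(1-P))."""
    return n * P, sqrt(n * P * (1 - P))

def beta_estimate(n, i):
    """(mean, sd) of P in the beta limit, from the formula in the text."""
    mean = (i + 1) / (n + 2)
    sd = sqrt((i + 1) * (n - i + 1) / (n + 3)) / (n + 2)
    return mean, sd

if __name__ == "__main__":
    mean, sd = beta_estimate(0, 0)  # empty sample
    print(f"P ≈ ({100 * mean:.0f} ± {100 * sd:.0f})%")  # prints "P ≈ (50 ± 29)%"
```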

Poisson and Gamma

In the limiting case where $\frac{N}{n}$ and $n$ are large numbers, the deduction distribution of i tends towards the Poisson distribution with the intensity $M=\frac{nI}{N}$ as a parameter,

$i \approx M \pm \sqrt{M}$

and the induction distribution of M tends towards the gamma distribution

$M \approx i+1 \pm \sqrt{i+1}.$

Example

The population is large and the sample is large but contains no special items, so i = 0. The gamma distribution formula gives $M\approx 1 \pm 1$.

(The frequency probability solution to this problem is $M\approx 0$, which is misleading: even if you have not been wounded, you may still be vulnerable.)
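As a rough numerical check of this limit, the finite-population induction estimate can be rescaled by n/N and compared with the gamma formula. The sketch below is illustrative only: the function names are hypothetical, the values of N and n are arbitrary choices, and the closed form in `finite_induction` is what results from substituting the transformed arguments into f.

```python
from math import sqrt

def gamma_estimate(i):
    """(mean, sd) of M in the gamma limit: i+1 ± sqrt(i+1)."""
    return i + 1, sqrt(i + 1)

def finite_induction(N, n, i):
    """(mean, sd) of I, i.e. -1 - f(-2-n, -2-N, -1-i) written out explicitly."""
    mean = (N + 2) * (i + 1) / (n + 2) - 1
    sd = sqrt((N + 2) * (i + 1) * (N - n) * (n - i + 1) / (n + 3)) / (n + 2)
    return mean, sd

def intensity_estimate(N, n, i):
    """Rescale the estimate of I to the intensity M = nI/N."""
    mean, sd = finite_induction(N, n, i)
    return n * mean / N, n * sd / N

if __name__ == "__main__":
    # Arbitrary sizes with N/n and n both large; no special items seen.
    print(intensity_estimate(10**8, 10**4, 0))
    print(gamma_estimate(0))
```

With these sizes both components of the rescaled estimate lie within about one percent of the gamma values 1 and 1.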