Introduction

The purpose of mathematical statistics is to determine properties of a (usually large) population on the basis of a so-called random sample. We pick one individual "at random" (meaning that each individual has the same chance of being chosen) and record the value of some characteristic (e.g. length, height, weight, ...) of this individual. This value X is a random variable whose distribution is the (relative) frequency distribution of that characteristic within the population. If we repeat this process n times, we get a random sample (X_1,\,...,\,X_n).

Sampling can be done with or without replacement, which means that a given individual can be chosen more than once or at most once, respectively. The former leads to a sample (X_1,\,...,\,X_n) of independent, identically distributed (i.i.d.) random variables X_1,\,...,\,X_n. In the latter case X_1,\,...,\,X_n are not independent. However, if the population is large relative to the sample size, there is almost no difference between the two methods, so we assume from now on that sampling is always done with replacement.
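The difference between the two sampling schemes can be illustrated with a minimal Python sketch. The population of integer labels, its size N = 100000 and the sample size n = 10 are assumptions chosen purely for illustration; the function names are hypothetical.

```python
import random

def sample_with_replacement(population, n, rng):
    # Each draw is made from the full population, so the draws are
    # independent and the same individual may appear more than once.
    return [rng.choice(population) for _ in range(n)]

def sample_without_replacement(population, n, rng):
    # Each individual can appear at most once, so the draws are dependent.
    return rng.sample(population, n)

rng = random.Random(0)
population = list(range(100_000))  # hypothetical labels of the individuals
n = 10

with_repl = sample_with_replacement(population, n, rng)
without_repl = sample_without_replacement(population, n, rng)

# Probability that n draws WITH replacement happen to be all distinct.
# If this is close to 1, the two schemes are almost indistinguishable.
N = len(population)
p_distinct = 1.0
for k in range(n):
    p_distinct *= (N - k) / N

print(with_repl)
print(without_repl)
print(p_distinct)
```

For these values the probability of drawing any individual twice is well below one percent, which is the sense in which the two methods barely differ for a large population.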

A random sample of size n from a distribution P is a sequence (X_1,\,...,\,X_n) of i.i.d. random variables with common distribution P. The number n is called the sample size and each X_i an observation.

Given a random sample, we would like to make statements about the underlying distribution. That this is possible is guaranteed by the Glivenko-Cantelli theorem: For a sequence (X_1,\,...,\,X_n,\,...) of i.i.d. random variables with common distribution function F we define the empirical distribution function as

F_n(x) := \frac{|\{i \le n : X_i \le x\}|}{n}

Then F_n converges to F uniformly with probability one, i.e. \sup_x |F_n(x) - F(x)| \to 0 almost surely.
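The uniform convergence can be observed numerically. The following Python sketch computes the empirical distribution function and the distance \sup_x |F_n(x) - F(x)| for simulated data. The choice of the exponential distribution with rate 1, the seed and the sample sizes are assumptions made only for this illustration, and the helper names are hypothetical.

```python
import bisect
import math
import random

def ecdf(sample):
    """Return F_n as a function: F_n(x) = |{i <= n : X_i <= x}| / n."""
    xs = sorted(sample)
    n = len(xs)

    def F_n(x):
        # bisect_right counts the observations that are <= x
        return bisect.bisect_right(xs, x) / n

    return F_n

def sup_distance(sample, F):
    """sup_x |F_n(x) - F(x)| for a continuous distribution function F.

    F_n is a step function, so the supremum is attained at (or just
    before) one of its jump points, i.e. at the sorted observations.
    """
    xs = sorted(sample)
    n = len(xs)
    return max(
        max((i + 1) / n - F(x), F(x) - i / n)
        for i, x in enumerate(xs)
    )

def exp_cdf(x):
    # Distribution function of the exponential distribution with rate 1
    return 1.0 - math.exp(-x) if x > 0 else 0.0

rng = random.Random(1)
for n in (100, 10_000):
    sample = [rng.expovariate(1.0) for _ in range(n)]
    print(n, sup_distance(sample, exp_cdf))
```

The printed distance should shrink as n grows. A single simulation run only illustrates the theorem and is of course no proof of it.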

If the distribution of X can be characterized by one or more real numbers (parameters), we speak of parametric statistics, otherwise we speak of non-parametric statistics.

 

If f is a function from R^n to R^{d} and (X_1,\,...,\,X_n) is a random sample, then T = f(X_1,\,...,\,X_n) is called a statistic.

Important statistics are the sample mean and the sample variance.
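As a small illustration, the sample mean and the sample variance can be computed as follows. The text does not say whether the sample variance is normalized by n or by n - 1, so this sketch exposes the choice through a hypothetical parameter `ddof` (an assumption borrowed from common numerical libraries); the observations are made-up numbers.

```python
def sample_mean(xs):
    # Arithmetic mean of the observations
    return sum(xs) / len(xs)

def sample_variance(xs, ddof=1):
    # ddof=1 divides by n - 1 (the unbiased version),
    # ddof=0 divides by n.
    n = len(xs)
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (n - ddof)

def mean_and_variance(xs):
    # A statistic with values in R^2, i.e. a function from R^n to R^d with d = 2
    return (sample_mean(xs), sample_variance(xs))

observations = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up data
print(sample_mean(observations))
print(sample_variance(observations, ddof=0))
print(sample_variance(observations, ddof=1))
```

Both functions depend on the sample only, not on any unknown parameter, which is what makes them statistics in the sense defined above.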

A statistic T is called sufficient for the parameter \theta if the conditional distribution of (X_1,\,...,\,X_n) given T does not depend on \theta.

An important criterion for sufficiency is the following: Let (P_\theta,\,\theta \in \Theta) be a parametric family of distributions dominated by a measure \mu, and let f_\theta = \frac{dP_\theta}{d\mu} be the corresponding densities. The statistic T = T(X_1,\,...,\,X_n) is sufficient for \theta if the so-called likelihood function

L(x_1,\,...,\,x_n,\,\theta) = f_\theta(x_1) \cdot ... \cdot f_\theta(x_n)

admits a decomposition

L(x_1,\,...,\,x_n,\,\theta) = g(T(x_1,\,...,\,x_n),\,\theta) \cdot h(x_1,\,...,\,x_n)

where h(x_1,\,...,\,x_n) does not depend on \theta.
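A standard worked example may help to see how the criterion is applied. Assume, purely for illustration, that the observations are Bernoulli distributed with success probability \theta \in (0,1), and that the dominating measure is the counting measure on \{0, 1\}; neither assumption comes from the text above.

```latex
% Density of a single observation with respect to the counting measure on {0,1}
f_\theta(x) = \theta^{x} (1-\theta)^{1-x}, \qquad x \in \{0, 1\}

% Likelihood function of the sample
L(x_1,\,...,\,x_n,\,\theta)
  = \prod_{i=1}^{n} \theta^{x_i} (1-\theta)^{1-x_i}
  = \theta^{\sum_{i=1}^{n} x_i} \, (1-\theta)^{\,n - \sum_{i=1}^{n} x_i}

% Decomposition with T(x_1,...,x_n) = x_1 + ... + x_n
g(t,\,\theta) = \theta^{t} (1-\theta)^{n-t}, \qquad
h(x_1,\,...,\,x_n) = 1
```

Under these assumptions the likelihood depends on the data only through the number of successes, so T = X_1 + ... + X_n is sufficient for \theta.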

