Central limit theorem

August 29, 2020 · April 26, 2023 · Archit Vora

What does the CLT say?

The sum of a large number of independent random samples is approximately normally distributed
- These samples need not come from a normal distribution
Because the sum is approximately normal, the sample mean (the sum divided by n) is also approximately normal
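A quick simulation makes this concrete. This is only a sketch: the exponential distribution, the sample size of 50, and the helper name `sample_means` are my own choices for illustration. It draws many samples from a skewed distribution and checks that the sample means behave like a normal distribution (roughly 68 % of them within one standard deviation of the center).

```python
import random
import statistics

def sample_means(draw, n, trials, seed=0):
    """Return `trials` sample means, each computed from `n` draws."""
    rng = random.Random(seed)
    return [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(trials)]

# Exponential(1) is strongly skewed, so it is clearly not normal.
means = sample_means(lambda rng: rng.expovariate(1.0), n=50, trials=5000)

center = statistics.fmean(means)
spread = statistics.stdev(means)
within_one_sd = sum(abs(m - center) <= spread for m in means) / len(means)

print(f"mean of sample means: {center:.3f}")         # should be near the population mean, 1
print(f"share within 1 sd:    {within_one_sd:.3f}")  # should be near 0.68 for a normal shape
```

Swapping the lambda for another distribution (for example `rng.random()`) should give a similar share, which is the point of the theorem.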

Straight facts

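The central limit theorem helps build confidence intervals for parameters such as the mean
As a rule of thumb, the normal approximation is usable for most distributions with finite variance when n > 30
If the samples themselves come from a normal distribution, the sample mean is exactly normal, so it holds even when n < 30
Why do we need to assume a distribution?
- To make the variance estimate stable
- So that only one unknown remains: the mean
- For small samples, we need to check the normality of the samples before applying a t-test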
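As a rough illustration of the confidence interval point, here is a minimal sketch. The function name `clt_confidence_interval` and the uniform sample are assumptions made for the example, and it uses the normal quantile rather than a t quantile, which is reasonable only for large n.

```python
import math
import random
import statistics
from statistics import NormalDist

def clt_confidence_interval(data, level=0.95):
    """Approximate CI for the mean, using the normal approximation from the CLT."""
    n = len(data)
    mean = statistics.fmean(data)
    std_err = statistics.stdev(data) / math.sqrt(n)  # estimated standard error
    z = NormalDist().inv_cdf(0.5 + level / 2)        # about 1.96 for level=0.95
    return mean - z * std_err, mean + z * std_err

rng = random.Random(42)
data = [rng.uniform(0, 10) for _ in range(200)]  # hypothetical sample, true mean is 5
low, high = clt_confidence_interval(data)
print(f"95% CI for the mean: ({low:.2f}, {high:.2f})")
```

Note that the data here is uniform, not normal. The interval is still usable because n = 200 is large enough for the approximation.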

Slide from MIT course : https://ocw.mit.edu/courses/mathematics/18-650-statistics-for-applications-fall-2016/

sigma / sqrt(n) is the standard error of the mean. The CLT says that the standardized sample mean, sqrt(n) * (sample mean - mu) / sigma, converges in distribution to the standard normal distribution N(0, 1).
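The sigma / sqrt(n) claim can be checked numerically. The sketch below assumes Uniform(0, 1) data (so sigma = sqrt(1/12)), and `empirical_se` is a helper name I made up. It compares the observed spread of sample means with the formula for a few values of n.

```python
import math
import random
import statistics

rng = random.Random(1)
sigma = math.sqrt(1 / 12)  # population standard deviation of Uniform(0, 1)

def empirical_se(n, trials=4000):
    """Standard deviation of `trials` sample means, each over `n` uniform draws."""
    means = [statistics.fmean(rng.random() for _ in range(n)) for _ in range(trials)]
    return statistics.stdev(means)

for n in (10, 40, 160):
    theory = sigma / math.sqrt(n)
    print(f"n={n:4d}  empirical={empirical_se(n):.4f}  sigma/sqrt(n)={theory:.4f}")
```

Each time n is multiplied by 4, both columns should roughly halve, which is the sqrt(n) scaling.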

Law of Large Numbers

As the sample size grows, the sample mean gets closer to the mean of the whole population. This is because a larger sample is more representative of the population.
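A small sketch of this, assuming a fair six-sided die as the population (my choice for the example), shows the sample mean settling toward 3.5 as more rolls are included.

```python
import random
import statistics

rng = random.Random(7)
population_mean = 3.5  # expected value of a fair six-sided die

rolls = [rng.randint(1, 6) for _ in range(100_000)]
for n in (10, 100, 1_000, 10_000, 100_000):
    sample_mean = statistics.fmean(rolls[:n])
    error = abs(sample_mean - population_mean)
    print(f"n={n:>6}  sample mean={sample_mean:.4f}  error={error:.4f}")
```

The error does not shrink monotonically at every step, but it tends toward zero as n grows.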

Example:

During significance testing we calculate the left-hand side of the CLT equation, that is, the standardized statistic sqrt(n) * (sample mean - mu) / sigma. For example, when testing the fairness of a coin, that number comes out to be 3.54. For a standard normal, 3*sigma = 3*1 = 3 covers about 99.7 % of the area. We are further away than that, so we can reject the null hypothesis. [1]
The thing to understand is that the estimate of the Bernoulli parameter p (the sample mean) is approximately normally distributed.
We are not measuring how far the observed mean is from 0.5 within the Bernoulli distribution itself. If we were doing that, we would not have used sqrt(n).
- More importantly, a Bernoulli variable can take only two values, 0 and 1, so that comparison would not make sense from this perspective either.
- See the equation in the central limit theorem slide. The standardized statistic follows a normal distribution N(0,1).
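The coin test can be sketched in a few lines. The counts below (125 heads in 200 tosses) are hypothetical and not taken from the course material. I picked them only because they give a statistic close to the 3.54 mentioned above. `coin_z_statistic` is likewise a name made up for this example.

```python
import math
from statistics import NormalDist

def coin_z_statistic(heads, tosses, p0=0.5):
    """Standardized distance of the observed proportion from p0 under the null."""
    p_hat = heads / tosses
    sigma0 = math.sqrt(p0 * (1 - p0))  # Bernoulli standard deviation under the null
    return math.sqrt(tosses) * (p_hat - p0) / sigma0

# Hypothetical counts, not taken from the source.
z = coin_z_statistic(heads=125, tosses=200)
p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided p-value from N(0, 1)

print(f"z = {z:.2f}, two-sided p-value = {p_value:.5f}")
```

The statistic is compared against N(0, 1), not against the Bernoulli distribution, which is why the sqrt(tosses) factor appears.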

References

[0] : Slide from MIT course : https://ocw.mit.edu/courses/mathematics/18-650-statistics-for-applications-fall-2016/

[1] : https://ocw.mit.edu/courses/18-650-statistics-for-applications-fall-2016/resources/mit18_650f16_parametric_ht/
