Central Limit Theorem and Law of Large Numbers
The ultimate goal of machine learning techniques is to estimate the true probability distribution of the data-generating process. This process could follow any distribution, trivial or non-trivial, and the samples/training data drawn from the population may not, on their own, justify any particular assumption about its shape. This is where the Central Limit Theorem comes in: it gives us some interesting properties to observe in the data.
Central Limit Theorem
The theorem states that as the sample size increases, the distribution of the sample mean, computed across many samples, approaches a Gaussian distribution, regardless of the shape of the population distribution (provided it has a finite mean and variance).
Say there is a random variable X whose probability distribution may or may not be Gaussian. Assume its population mean is μ.
Now, if I randomly draw n samples from X, each of size m:
- S₁ = [x₁₁, x₁₂, x₁₃, …, x₁ₘ], with sample mean x̄₁
- S₂ = [x₂₁, x₂₂, x₂₃, …, x₂ₘ], with sample mean x̄₂
- …
- Sₙ = [xₙ₁, xₙ₂, xₙ₃, …, xₙₘ], with sample mean x̄ₙ
Then the sample mean x̄, whose observed values are [x̄₁, x̄₂, x̄₃, …, x̄ₙ], will follow an approximately Gaussian distribution whose mean is the same as the population mean, i.e. μ.
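To see this in action, here is a minimal NumPy sketch of the setup above: it draws n samples of size m from a deliberately non-Gaussian (exponential) population and checks that the sample means cluster around μ in a roughly bell-shaped way. The choice of distribution, the seed, and the values of μ, n and m are arbitrary, purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Population: an exponential distribution -- clearly non-Gaussian (right-skewed).
# Its true mean mu equals the scale parameter; 5.0 is an arbitrary choice.
mu = 5.0
n = 1000   # number of samples S1, ..., Sn
m = 50     # size m of each sample

# Draw n samples of size m and record each sample's mean
sample_means = np.array([
    rng.exponential(scale=mu, size=m).mean()
    for _ in range(n)
])

print("Population mean mu:      ", mu)
print("Mean of the sample means:", round(sample_means.mean(), 3))  # close to mu

# A histogram of sample_means (e.g. via matplotlib's plt.hist) comes out
# approximately bell-shaped, even though the underlying population is skewed.
```

Increasing m makes the bell shape tighter around μ, while increasing n simply gives a smoother picture of that distribution.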