But what is the Central Limit Theorem?

3Blue1Brown • Mar 13, 2023

Audio Brief

This episode introduces the Central Limit Theorem, explaining why the normal distribution frequently appears across various fields, demonstrating order emerging from seemingly random events. There are four key takeaways from this discussion.

First, the sum of many independent, identically distributed random processes tends to form a normal distribution, even if individual processes are not normally distributed. Second, this principle explains the bell curve's prevalence in nature and statistics, as many phenomena result from multiple small, random influences accumulating. Third, when summing N random variables, the mean scales by N, while the standard deviation, which measures spread, scales more slowly by the square root of N. Fourth, the theorem's application requires that random variables be independent, identically distributed, and possess a finite variance.

The Galton Board provides a visual demonstration of how individual random paths collectively create a predictable, bell-shaped pattern. Similarly, the episode illustrates the theorem using simplified models, such as repeatedly adding the results of multiple dice rolls, showing how the distribution of the sum approaches a bell curve.

Many observations, from human height to market movements, result from numerous small, independent factors. The Central Limit Theorem explains why their aggregate distribution often appears as a bell curve, making complex systems more predictable at scale.

The slower growth in standard deviation means that as more variables are summed, the distribution becomes more concentrated around its mean relative to its overall magnitude. This crucial scaling property highlights how aggregated random processes achieve a greater degree of predictability.

The assumptions of independence, identical distribution, and finite variance are critical for applying the Central Limit Theorem correctly. Understanding these conditions helps define the theorem's scope and limitations in real-world statistical analysis. Understanding the Central Limit Theorem offers profound insights into the underlying order within apparent randomness.

Episode Overview

The episode introduces the Galton Board as a physical demonstration of how random events can create a predictable pattern, specifically the normal distribution.
It explains that the normal distribution, also known as the bell curve, appears in many seemingly unrelated contexts, from human height to number theory.
The core of the episode is an introduction to the Central Limit Theorem, which explains why the normal distribution is so common.
The theorem is illustrated through simplified models and simulations, such as adding the results of multiple dice rolls, showing how the distribution of the sum approaches a bell curve even if the initial distribution is not bell-shaped.
The video breaks down the key concepts needed to understand the theorem, including random variables, mean, variance, and standard deviation, and explains the formula for the normal distribution.
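
The Galton Board idea can be sketched in a few lines of Python. This is a minimal simulation under a simplifying assumption that is not taken from the episode: each peg sends the ball left or right with equal probability, independently of every other peg. The function name `galton_board` and its parameters are illustrative only.

```python
import random
from collections import Counter

def galton_board(num_balls, num_rows, seed=0):
    """Drop balls through num_rows of pegs; return the count in each bin 0..num_rows."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same picture
    counts = Counter()
    for _ in range(num_balls):
        # Assumed idealization: each peg nudges the ball right (1) or left (0), 50/50.
        bin_index = sum(rng.randint(0, 1) for _ in range(num_rows))
        counts[bin_index] += 1
    return [counts[b] for b in range(num_rows + 1)]

bins = galton_board(num_balls=10_000, num_rows=12)
for b, count in enumerate(bins):
    # One '#' per 50 balls gives a rough text histogram of the pile.
    print(f"{b:2d} {'#' * (count // 50)}")
```

Each ball's final bin is a sum of 12 independent left/right steps, so the printed histogram should bulge in the middle and thin out toward the edges, which is the bell shape the episode describes.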

Key Concepts

Galton Board: A device where balls are dropped through a triangular array of pegs. Each ball's path is random, but the collective distribution of balls at the bottom closely approximates a normal distribution, demonstrating order emerging from chaos.
Normal Distribution: A fundamental probability distribution characterized by its symmetrical, bell-shaped curve. It is also known as the Gaussian distribution or bell curve.
Central Limit Theorem (CLT): A foundational theorem in probability stating that the distribution of the sum (or average) of a large number of independent, identically distributed random variables will be approximately a normal distribution, regardless of the shape of the original distribution, provided that distribution has a finite variance.
Random Variable: A process where each outcome is associated with a number. The value of the variable is determined by a random event, like the outcome of a die roll.
Mean (μ): The expected value or average of a random variable's distribution. It represents the "center of mass." For a sum of N random variables, the new mean is N times the original mean.
Variance (Var(X) or σ²): A measure of how spread out a distribution is. It's calculated as the expected value of the squared difference from the mean. The variance of a sum of N independent random variables is N times the original variance.
Standard Deviation (σ): The square root of the variance, providing a measure of spread in the same units as the data. For a sum of N independent random variables drawn from the same distribution, the new standard deviation is √N times the original standard deviation (that is, σ·√N, not the square root of N·σ).
68–95–99.7 Rule: A rule of thumb for normal distributions, stating that approximately 68% of values fall within one standard deviation of the mean, 95% within two standard deviations, and 99.7% within three standard deviations.
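
The scaling rules and the 68–95–99.7 rule above can be checked numerically. The sketch below is an illustration, not the episode's own code: it assumes a fair six-sided die, builds the exact distribution of the sum of 50 rolls by repeated convolution, and compares the result with N·μ, √N·σ, and the normal-curve percentages. The helper names (`sum_distribution`, `within`) are hypothetical.

```python
import math

def sum_distribution(pmf, n):
    """Exact distribution of the sum of n independent draws from pmf (value -> probability)."""
    dist = {0: 1.0}
    for _ in range(n):
        new = {}
        for s, p in dist.items():
            for v, q in pmf.items():
                new[s + v] = new.get(s + v, 0.0) + p * q
        dist = new
    return dist

def mean(pmf):
    return sum(v * p for v, p in pmf.items())

def variance(pmf):
    mu = mean(pmf)
    return sum((v - mu) ** 2 * p for v, p in pmf.items())

def within(dist, k):
    """Probability that the value lies within k standard deviations of the mean."""
    m, s = mean(dist), math.sqrt(variance(dist))
    return sum(p for v, p in dist.items() if abs(v - m) <= k * s)

die = {face: 1 / 6 for face in range(1, 7)}  # assumed: a fair six-sided die
n = 50
total = sum_distribution(die, n)

mu_n = mean(total)
sd_n = math.sqrt(variance(total))
print("mean of sum:", mu_n, "vs N * mu:", n * mean(die))
print("sd of sum:  ", sd_n, "vs sqrt(N) * sigma:", math.sqrt(n) * math.sqrt(variance(die)))

for k in (1, 2, 3):
    # math.erf(k / sqrt(2)) is the probability a normal variable is within k sigma of its mean.
    print(k, "sd:", within(total, k), "normal:", math.erf(k / math.sqrt(2)))
```

The mean and standard deviation should agree with the formulas up to floating-point error. The "within k standard deviations" figures should land close to 68%, 95%, and 99.7% but not exactly on them, since the sum of dice takes only whole-number values and the normal curve is an approximation at finite N.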

Quotes

At 00:00 - "This is a Galton board." - The opening line introducing the physical model used to demonstrate the main concept.
At 01:14 - "...known as the Central Limit Theorem." - Introducing the core mathematical concept the episode will explain.
At 04:18 - "Start with a random variable: X (a random process, where each outcome is associated with some number)." - Beginning the formal, general explanation of the Central Limit Theorem.
At 24:00 - "But what is the actual rigorous no-jokes-this-time Central Limit Theorem?" - Transitioning from intuitive examples to the precise mathematical statement of the theorem.
At 28:13 - "Lastly, but probably most importantly, let's talk about the assumptions that go into this theorem." - Highlighting the critical conditions required for the Central Limit Theorem to apply.

Takeaways

The sum of many independent and identically distributed random processes will tend to form a normal distribution (a bell curve), even if the original process is not normally distributed.
This principle, the Central Limit Theorem, is why the bell curve appears so frequently in nature and statistics; many phenomena are the result of multiple small, random influences being added together.
When summing N random variables, the mean grows by a factor of N, but the standard deviation (a measure of spread) grows more slowly, by a factor of the square root of N.
The three key assumptions for the Central Limit Theorem are that the random variables being summed are independent, are all drawn from the same distribution, and have a finite variance.
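
For reference, one standard way to write the theorem under these three assumptions is shown below. This is the classical textbook form; the notation (X_1, ..., X_N for the variables, μ and σ for their common mean and standard deviation, a and b for an arbitrary interval) is assumed here rather than quoted from the episode.

```latex
% Assumptions: X_1, X_2, \dots are independent, identically distributed,
% with mean \mu and finite variance \sigma^2 > 0.
\lim_{N \to \infty}
P\!\left( a < \frac{(X_1 + \cdots + X_N) - N\mu}{\sigma\sqrt{N}} < b \right)
= \int_a^b \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}\, dx
```

Subtracting N·μ and dividing by σ·√N recenters and rescales the sum so that it has mean 0 and standard deviation 1, which is why the limit is the standard normal curve no matter which distribution the individual variables come from.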
