
The Equation of the Gaussian Distribution Curve

[!info] The Gaussian distribution (also called the normal distribution) is fundamental to statistical analysis and appears naturally in many physical phenomena. Let’s see how it emerges from first principles.

Let’s derive the equation that describes the Gaussian distribution, beginning with a fundamental model of random variation.

Consider a quantity whose true value is $X$, but when measured, it is subject to random uncertainty. We'll model this uncertainty as arising from many small, independent fluctuations that can be either positive or negative with equal probability.

Specifically, imagine that our measurement is affected by $2n$ small fluctuations, each with magnitude $E$. Each fluctuation has equal probability of being positive or negative. The measured value $x$ can therefore range from $X - 2nE$ (if all fluctuations are negative) to $X + 2nE$ (if all are positive).

What we want to determine is the probability distribution for observing a particular deviation $R$ within this range of possible values. This probability depends on how many different ways a specific deviation can occur.

Understanding the Combinatorial Basis

Think about extreme deviations first. A deviation of exactly $+2nE$ can happen in only one way: when all $2n$ fluctuations are positive. Similarly, a deviation of $-2nE$ also happens in only one way.

A deviation of $(2n-2)E$ is more likely because it can happen whenever exactly one of the fluctuations is negative (and the rest positive). Since any one of the $2n$ fluctuations could be that negative one, there are $2n$ different ways this deviation could occur.

[!example] Concrete example Imagine just 4 fluctuations (so $n=2$), each with magnitude $E=0.1$ units:

  • A deviation of +0.4 requires all fluctuations to be +0.1 (only 1 way)
  • A deviation of +0.2 requires 3 positive and 1 negative fluctuation (4 possible ways)
  • A deviation of 0 requires 2 positive and 2 negative fluctuations (6 possible ways)

And so on. Notice how the middle values are more likely; the short enumeration below confirms these counts.
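As a minimal sketch of this counting argument (using the example's values $2n = 4$ and $E = 0.1$), the following Python snippet enumerates every sign combination and tallies how many produce each deviation:

```python
# Enumerate all 2^4 sign combinations for 4 fluctuations of magnitude E = 0.1
# and count how many combinations yield each total deviation.
from itertools import product
from collections import Counter

E = 0.1          # magnitude of each fluctuation (from the example above)
num_flucts = 4   # 2n = 4

counts = Counter()
for signs in product((+1, -1), repeat=num_flucts):
    deviation = sum(signs) * E   # net number of + signs minus - signs, times E
    counts[deviation] += 1

for deviation in sorted(counts):
    print(f"deviation {deviation:+.1f}: {counts[deviation]} way(s)")
```

Running this prints 1 way for ±0.4, 4 ways for ±0.2, and 6 ways for 0, matching the counts listed above.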

More generally, if we want a total deviation $R$ equal to $2rE$ (where $r \le n$), this means that out of our $2n$ fluctuations, $(n+r)$ must be positive and $(n-r)$ must be negative. The number of ways to select $(n+r)$ positions from $2n$ positions is:

$$\frac{(2n)!}{(n+r)!\,(n-r)!}$$

This quantity represents the number of possible arrangements that yield our desired deviation. To convert this to a probability, we multiply by the probability of getting any specific arrangement of $(n+r)$ positive and $(n-r)$ negative fluctuations, which is:

$$\left(\frac{1}{2}\right)^{n+r}\left(\frac{1}{2}\right)^{n-r} = \left(\frac{1}{2}\right)^{2n}$$

The probability of deviation $R$ is therefore:

$$\frac{(2n)!}{(n+r)!\,(n-r)!}\left(\frac{1}{2}\right)^{2n}$$
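A small sketch of this formula in Python follows; it reproduces the 4-fluctuation example ($n = 2$), and the choice of $r$ values shown is purely illustrative:

```python
# P(R = 2rE) = (2n)! / ((n+r)! (n-r)!) * (1/2)^(2n)
from math import factorial

def deviation_probability(n: int, r: int) -> float:
    """Probability that exactly (n + r) of 2n equally likely +/- fluctuations are positive."""
    ways = factorial(2 * n) // (factorial(n + r) * factorial(n - r))
    return ways * 0.5 ** (2 * n)

# Example with n = 2 (four fluctuations): probabilities 1/16, 4/16, 6/16, 4/16, 1/16
for r in range(-2, 3):
    print(f"r = {r:+d}: P = {deviation_probability(2, r):.4f}")
```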

Simplifying with Stirling’s Approximation

[!theorem] Stirling’s Approximation For large values of $n$:

$$n! \approx \sqrt{2\pi n}\left(\frac{n}{e}\right)^n$$

To evaluate our expression for large $n$, we need Stirling’s approximation. Here’s why this approximation works:

Consider that

$$\int_1^n \ln x \, dx = \left[x\ln x - x\right]_1^n = n\ln n - n + 1$$

The integral approximates the sum $\ln 1 + \ln 2 + \ln 3 + \dots + \ln n$, which equals $\ln(n!)$.

*Figure: The area under the curve of $\ln x$ approximates the sum of logarithms.*

Therefore:

$$\ln(n!) \approx n\ln n - n$$

$$n! \approx e^{-n}n^n$$

This gives us the basic form, though the complete approximation includes the $\sqrt{2\pi n}$ factor.
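To see how good the approximation is, here is a quick numerical check (the particular values of $n$ are an arbitrary choice for illustration) comparing both forms against the exact factorial:

```python
# Compare Stirling's approximation (with and without the sqrt(2*pi*n) factor)
# against the exact factorial for a few values of n.
from math import factorial, sqrt, pi, e

for n in (5, 10, 20, 50):
    exact = factorial(n)
    basic = (n / e) ** n                      # n! ~ e^{-n} n^n
    full = sqrt(2 * pi * n) * (n / e) ** n    # n! ~ sqrt(2*pi*n) (n/e)^n
    print(f"n = {n:2d}: full/exact = {full / exact:.4f}, basic/exact = {basic / exact:.2e}")
```

The ratio of the full approximation to the exact value approaches 1 as $n$ grows, while the basic form captures only the dominant exponential behaviour.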

The Continuous Limit

Applying Stirling’s approximation to our probability expression and taking the limit as $n$ approaches infinity (after several routine algebraic steps), we eventually obtain:

$$\frac{1}{\sqrt{n\pi}}\,e^{-\frac{r^2}{n}}$$

This gives us the essence of the Gaussian form: the probability decreases exponentially with the square of the deviation. Converting to standard notation, with $x$ representing the deviation from the mean value $X$ and a parameter $h$ related to the width of the distribution:

$$P(x)\,dx = \frac{h}{\sqrt{\pi}}\,e^{-h^2x^2}\,dx$$

where $P(x)\,dx$ represents the probability of finding a deviation between $x$ and $x+dx$.
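The agreement between the exact combinatorial probability and the limiting form $\frac{1}{\sqrt{n\pi}}e^{-r^2/n}$ can be checked numerically. In the sketch below, $n = 50$ is an assumed value chosen only to illustrate the convergence; any moderately large $n$ behaves similarly:

```python
# Compare the exact binomial probability with its Gaussian limit
# 1/sqrt(n*pi) * exp(-r^2 / n) for a moderately large n.
from math import comb, exp, sqrt, pi

n = 50
for r in (0, 2, 5, 10):
    exact = comb(2 * n, n + r) * 0.5 ** (2 * n)   # (2n)! / ((n+r)!(n-r)!) * (1/2)^(2n)
    gauss = exp(-r ** 2 / n) / sqrt(n * pi)
    print(f"r = {r:2d}: exact = {exact:.5f}, gaussian = {gauss:.5f}")
```

The two columns agree to about three decimal places already at $n = 50$.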

Standard Deviation of the Gaussian Distribution

The standard deviation provides a measure of the typical spread of values in the distribution. For a Gaussian distribution, we find the standard deviation by calculating:

$$\sigma^2 = \frac{1}{N}\int_{-\infty}^{\infty}\frac{Nh}{\sqrt{\pi}}\,e^{-h^2x^2}x^2\,dx = \frac{h}{\sqrt{\pi}}\int_{-\infty}^{\infty}x^2e^{-h^2x^2}\,dx$$

This integral equals $\frac{\sqrt{\pi}}{2h^3}$, giving us:

$$\sigma^2 = \frac{1}{2h^2}$$

Therefore:

$$\sigma = \frac{1}{\sqrt{2}\,h}$$

This allows us to rewrite the probability function in terms of the standard deviation:

$$P(x)\,dx = \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-\frac{x^2}{2\sigma^2}}\,dx$$

[!success] Important result This is the standard form of the Gaussian distribution used in statistical analysis. Notice how knowing $\sigma$ completely determines the shape of the distribution.
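The relation $\sigma^2 = 1/(2h^2)$ can be verified numerically by integrating $x^2 P(x)$ directly. In this sketch the width parameter $h = 1.3$ is an arbitrary assumption, and a simple midpoint rule stands in for the exact integral:

```python
# Numerically check that the variance of P(x) = (h/sqrt(pi)) exp(-h^2 x^2)
# equals 1 / (2 h^2).
from math import exp, sqrt, pi

h = 1.3  # arbitrary assumed width parameter

def p(x: float) -> float:
    """Gaussian density in the h-parameterisation."""
    return h / sqrt(pi) * exp(-(h * x) ** 2)

# Midpoint-rule integration of x^2 * p(x) over a range wide enough to capture the tails.
dx, lo, hi = 1e-3, -8.0, 8.0
steps = int(round((hi - lo) / dx))
variance = sum((lo + (i + 0.5) * dx) ** 2 * p(lo + (i + 0.5) * dx) * dx for i in range(steps))

print(f"numerical sigma^2 = {variance:.6f}")
print(f"1 / (2 h^2)       = {1 / (2 * h * h):.6f}")
```

Both printed values agree to six decimal places, as expected.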

Areas Under the Gaussian Distribution Curve

A key practical question is: what fraction of measurements will fall within certain limits? To answer this, we need to find the area under portions of the Gaussian curve.

The probability that a measurement falls between $0$ and $x$ is:

$$\int_0^x \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-\frac{t^2}{2\sigma^2}}\,dt$$

This integral has been calculated numerically and tabulated. The table below shows these probabilities for different values of $x/\sigma$:

| $x/\sigma$ | Probability of deviation between 0 and $x$ |
|---|---|
| 0.0 | 0.0 |
| 0.5 | 0.19 |
| 1.0 | 0.34 |
| 1.5 | 0.43 |
| 2.0 | 0.48 |
| 3.0 | 0.499 |

*Figure A1.1: The shaded area under the Gaussian curve represents the probability of a deviation falling between 0 and $x$.*

For the probability that a measurement falls within $\pm x$ of the mean (the symmetric interval), we double these values.
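These tabulated areas can be reproduced with the error function, since $\int_0^x P(t)\,dt = \tfrac{1}{2}\,\mathrm{erf}\!\left(\frac{x/\sigma}{\sqrt{2}}\right)$. A short sketch, also showing the doubled symmetric values:

```python
# Reproduce the tabulated one-sided areas and the doubled (+/- x) intervals
# using the standard error function.
from math import erf, sqrt

for k in (0.0, 0.5, 1.0, 1.5, 2.0, 3.0):       # k = x / sigma
    one_sided = erf(k / sqrt(2)) / 2            # area between 0 and x
    two_sided = 2 * one_sided                   # area between -x and +x
    print(f"x/sigma = {k:3.1f}: 0..x area = {one_sided:.3f}, +/-x area = {two_sided:.3f}")
```

The doubled column gives the familiar 0.68, 0.95, and 0.997 for $k = 1, 2, 3$.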

[!key] Key values to remember

  • Approximately 68% of all measurements fall within $\pm 1\sigma$ of the mean
  • Approximately 95% fall within $\pm 2\sigma$
  • Approximately 99.7% fall within $\pm 3\sigma$ (the “three-sigma rule”)

These probabilities form the foundation of statistical inference. When we make statements about the uncertainty of measurements, we often use these standard intervals, particularly the 68% confidence interval ($\pm 1\sigma$) and the 95% confidence interval ($\pm 2\sigma$).

[!challenge] Think about it Why would scientists often choose to report uncertainties using the “one-sigma” (68%) interval rather than, say, a 90% interval? What are the tradeoffs between choosing wider versus narrower confidence intervals?