Standard deviation

The standard deviation of a data set is a measure of how "spread out" the data points are. Unlike range, which only measures the difference between the maximum and minimum, standard deviation takes into account the deviation of every data point from the mean. Furthermore, unlike variance, standard deviation scales linearly with the values in the data set; that is, multiplying all of the data points by a constant $k$ multiplies the standard deviation by $|k|$, whereas it multiplies the variance by $k^2$.
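The scaling behavior can be checked numerically. The following is a minimal sketch using Python's standard `statistics` module; the data values and the constant `k` are made up for illustration, and `k` is chosen negative to show that the standard deviation scales by the absolute value of the constant.

```python
import math
import statistics

# Illustrative data (not from the article); its mean is 5.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
k = -3.0  # hypothetical scaling constant, negative on purpose
scaled = [k * x for x in data]

# Population standard deviation and variance, before and after scaling.
sd = statistics.pstdev(data)
sd_scaled = statistics.pstdev(scaled)
var = statistics.pvariance(data)
var_scaled = statistics.pvariance(scaled)

# The range uses only the two extreme values.
data_range = max(data) - min(data)

print(f"range = {data_range}")
print(f"sd = {sd}, sd after scaling = {sd_scaled}")
print(f"variance = {var}, variance after scaling = {var_scaled}")

# Standard deviation scales by |k|; variance scales by k^2.
assert math.isclose(sd_scaled, abs(k) * sd)
assert math.isclose(var_scaled, k * k * var)
```

For this data set the standard deviation is 2, so scaling by $-3$ gives a standard deviation of 6, while the variance grows from 4 to 36.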

Formula

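For an entire population, the standard deviation is the square root of the variance. Explicitly, for a data set $X = \{ x_1, x_2, x_3, \dots, x_n \}$ with mean $\overline{x}$, the formula for population standard deviation is \[\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^n (x_i - \overline{x})^2}.\] However, if $X$ is only a sample, the situation is subtler. Bessel's correction replaces the divisor $n$ with $n - 1$ so that the sample variance is an unbiased estimate of the population variance, but the square root of that variance is still a biased estimate of the population standard deviation: taking a square root does not preserve unbiasedness, and the result tends to underestimate the true value. When the data are approximately normally distributed, a good approximation to an unbiased estimate is \[s = \sqrt{\frac{1}{n - \frac{3}{2}}\sum_{i=1}^n (x_i - \overline{x})^2}.\] Many textbooks and software packages nevertheless report the square root of the Bessel-corrected variance, with divisor $n - 1$, as the sample standard deviation. Conventionally, $s$ denotes sample standard deviation, while $\sigma$ denotes population standard deviation.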
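As a rough sketch, the formulas above differ only in the divisor under the square root, so they can share one helper. The function name `standard_deviation`, its `correction` parameter, and the data values are hypothetical choices made for this illustration; a correction of 1 is included to show the square root of the Bessel-corrected variance for comparison.

```python
import math

def standard_deviation(xs, correction=0.0):
    """Return sqrt(sum((x - mean)^2) / (n - correction)).

    correction = 0    gives the population formula (divisor n),
    correction = 1    gives the square root of the Bessel-corrected variance,
    correction = 1.5  gives the n - 3/2 approximation for a sample.
    """
    n = len(xs)
    if n - correction <= 0:
        raise ValueError("need more data points than the correction")
    mean = sum(xs) / n
    sum_sq = sum((x - mean) ** 2 for x in xs)
    return math.sqrt(sum_sq / (n - correction))

# Illustrative data (not from the article); mean 5, sum of squares 32.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

sigma = standard_deviation(data)            # divisor n
s_bessel = standard_deviation(data, 1.0)    # divisor n - 1
s_approx = standard_deviation(data, 1.5)    # divisor n - 3/2

print(f"sigma    = {sigma:.4f}")
print(f"s (n-1)  = {s_bessel:.4f}")
print(f"s (n-3/2) = {s_approx:.4f}")
```

A smaller divisor always gives a larger result, so for the same data the three values increase in the order shown. The gap between them shrinks as $n$ grows.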
