Descriptive statistics turn raw data into summaries that reveal patterns, central tendency, and variability. The three basic measures of central tendency (mean, median, and mode) each tell a different story: the mean (arithmetic average) is sensitive to outliers, the median (middle value) resists them, and the mode (most frequent value) identifies the most common observation. Measures of spread (range, variance, and standard deviation) quantify how dispersed the data points are around the center. Standard deviation, the square root of variance, is particularly useful because it is expressed in the same unit as the original data. For approximately normal data it also follows the empirical rule: about 68% of values fall within one standard deviation of the mean, 95% within two, and 99.7% within three. Beyond these basics, skewness measures asymmetry in the distribution (positive skew means a longer right tail), while kurtosis measures the heaviness of the tails compared to a normal distribution. These statistics underpin data-driven decisions in many fields, from quality control in manufacturing (Six Sigma uses standard deviations to define defect rates) to academic research, financial analysis, and public health surveillance.
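As a rough illustration of how the three measures of central tendency react to an outlier, the sketch below uses Python's standard `statistics` module. The data values and the `summarize` helper are made up for this example.

```python
import statistics

# Hypothetical sample values, chosen only for illustration.
values = [2, 3, 3, 5, 7]
with_outlier = values + [100]  # same data plus one extreme value


def summarize(data):
    """Return the mean, median, and mode of a list of numbers."""
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "mode": statistics.mode(data),
    }


base = summarize(values)
shifted = summarize(with_outlier)

print(base)     # mean 4, median 3, mode 3
print(shifted)  # mean 20, median 4.0, mode 3
```

Adding the single value 100 moves the mean from 4 to 20, while the median moves only from 3 to 4 and the mode stays at 3.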
Understanding Descriptive Statistics
Descriptive statistics condense a data set into a few meaningful summary numbers. Central tendency measures (mean, median, mode) tell you where the center of the data lies. Dispersion measures (range, variance, standard deviation, and the interquartile range, IQR) tell you how spread out the values are. Shape measures (skewness and kurtosis) describe the distribution's symmetry and tail behavior. Together these statistics give a compact picture of your data without needing to look at every individual value, although a summary can still hide features such as multiple peaks that a plot would reveal.
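The dispersion and shape measures named above can be sketched in a few lines of Python. This is a minimal version under stated assumptions: the IQR uses the default "exclusive" method of `statistics.quantiles`, and skewness and kurtosis use the simple moment-based (population) formulas. Other tools may use different quartile conventions or bias-corrected formulas, so their results can differ slightly. The function names and sample data are hypothetical.

```python
import statistics


def iqr(data):
    """Interquartile range: third quartile minus first quartile."""
    q1, _, q3 = statistics.quantiles(data, n=4)  # default method is 'exclusive'
    return q3 - q1


def central_moment(data, k):
    """k-th central moment, dividing by N (population convention)."""
    mu = statistics.fmean(data)
    return sum((x - mu) ** k for x in data) / len(data)


def skewness(data):
    """Moment-based skewness; positive means a longer right tail.

    Assumes the data are not all identical (variance must be non-zero).
    """
    m2 = central_moment(data, 2)
    return central_moment(data, 3) / m2 ** 1.5


def excess_kurtosis(data):
    """Moment-based kurtosis minus 3, so a normal distribution scores about 0."""
    m2 = central_moment(data, 2)
    return central_moment(data, 4) / m2 ** 2 - 3.0


# Hypothetical data sets, chosen only for illustration.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]

print(iqr(symmetric), skewness(symmetric), excess_kurtosis(symmetric))
print(iqr(right_skewed), skewness(right_skewed), excess_kurtosis(right_skewed))
```

The symmetric data set has a skewness of zero (up to floating-point rounding), while the data set with one large value has a positive skewness, matching the "longer right tail" reading.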
Population vs Sample Statistics
When your data represents an entire population, you divide the sum of squared deviations by N to get the population variance, and take its square root for the population standard deviation. When your data is a sample drawn from a larger population, you divide by N-1 instead (Bessel's correction), which gives an unbiased estimate of the population variance. For the same data set, the sample standard deviation is larger than the population standard deviation by a factor of sqrt(N / (N-1)), unless all values are identical, in which case both are zero. For large N the difference is negligible, but for small samples it matters: with N = 5 the sample standard deviation is about 12% larger.
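The difference between the two divisors can be checked with Python's standard `statistics` module, where `pvariance` and `pstdev` divide by N and `variance` and `stdev` divide by N-1. The data values and the `correction_ratio` helper below are assumptions made for this sketch.

```python
import math
import statistics

# Hypothetical data: mean is 5 and the squared deviations sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

pop_var = statistics.pvariance(data)   # divides by N      -> 32 / 8
samp_var = statistics.variance(data)   # divides by N - 1  -> 32 / 7
pop_sd = statistics.pstdev(data)       # square root of pop_var
samp_sd = statistics.stdev(data)       # square root of samp_var


def correction_ratio(n):
    """Ratio of sample to population standard deviation for n >= 2 values."""
    return math.sqrt(n / (n - 1))


print(pop_var, pop_sd)
print(samp_var, samp_sd)

# The ratio shrinks toward 1 as the sample size grows.
for n in (5, 30, 1000):
    print(n, round(correction_ratio(n), 4))
```

For this data set the population standard deviation is exactly 2, and the sample standard deviation is larger by the factor `correction_ratio(8)`. The loop shows how quickly that factor approaches 1 as N increases.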