Box Plot Calculator
Generate box plot statistics (Q1, Q2, Q3, IQR, whiskers, outliers) from a data set. Enter values for instant results with step-by-step formulas.
Formula
IQR = Q3 - Q1; Outliers: x < Q1 - 1.5*IQR or x > Q3 + 1.5*IQR
Here Q1 is the first quartile (25th percentile), Q3 is the third quartile (75th percentile), and IQR is the interquartile range. Values more than 1.5 times the IQR beyond the box edges are outliers: those between 1.5 and 3 times the IQR beyond the edges are mild outliers, and those more than 3 times the IQR beyond the edges are extreme outliers.
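As a rough illustration of the fence rule, the following Python sketch labels a single value given Q1 and Q3. The function name classify and its return labels are hypothetical, chosen only for this example; the quartile values in the usage lines are the ones from Example 2 below.

```python
def classify(x, q1, q3):
    """Label x as 'none', 'mild', or 'extreme' using the 1.5*IQR and 3*IQR fences."""
    iqr = q3 - q1
    inner_low, inner_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outer_low, outer_high = q1 - 3.0 * iqr, q3 + 3.0 * iqr
    if x < outer_low or x > outer_high:
        return "extreme"   # beyond the 3*IQR (outer) fences
    if x < inner_low or x > inner_high:
        return "mild"      # beyond the 1.5*IQR fences but inside the outer ones
    return "none"          # inside the inner fences


# Quartiles taken from Example 2 (Q1 = 310, Q3 = 525)
print(classify(800, 310, 525))    # none
print(classify(1500, 310, 525))   # extreme
```

Checking the outer fences first keeps the two labels mutually exclusive, so a value is reported as either mild or extreme, never both.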
Worked Examples
Example 1: Student Test Scores Analysis
Problem: Analyze the distribution of test scores: 45, 55, 60, 62, 65, 70, 72, 75, 78, 80, 82, 85, 88, 90, 95, 98.
Solution: Sorted: 45, 55, 60, 62, 65, 70, 72, 75, 78, 80, 82, 85, 88, 90, 95, 98
n = 16
Q1 position = 0.25(16 + 1) = 4.25, so Q1 = 62 + 0.25(65 - 62) = 62.75
Q2 (median) position = 0.5(16 + 1) = 8.5, so Q2 = (75 + 78) / 2 = 76.5
Q3 position = 0.75(16 + 1) = 12.75, so Q3 = 85 + 0.75(88 - 85) = 87.25
IQR = 87.25 - 62.75 = 24.5
Lower fence = 62.75 - 1.5(24.5) = 26
Upper fence = 87.25 + 1.5(24.5) = 124
No outliers. Whiskers: 45 to 98.
Result: Q1 = 62.75 | Median = 76.5 | Q3 = 87.25 | IQR = 24.5 | No outliers
Example 2: Income Data with Outliers
Problem: Weekly earnings in dollars: 200, 250, 300, 320, 350, 380, 400, 420, 450, 500, 550, 800, 1500.
Solution: n = 13, Q1 = 310, Q2 = 400, Q3 = 525
IQR = 525 - 310 = 215
Lower fence = 310 - 1.5(215) = -12.5
Upper fence = 525 + 1.5(215) = 847.5
800 is within the fences (not an outlier). 1500 > 847.5, so 1500 is an outlier.
Upper outer fence = 525 + 3(215) = 1170. Since 1500 > 1170, it is an extreme outlier.
Result: Median = $400 | IQR = $215 | 1 extreme outlier ($1,500)
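The Example 2 figures can be reproduced with a short Python sketch. It assumes that the standard library function statistics.quantiles with method="exclusive" matches the (n+1) position rule this page describes; that holds for this data set, but it is not a statement about how the calculator itself is implemented.

```python
from statistics import quantiles

earnings = [200, 250, 300, 320, 350, 380, 400, 420, 450, 500, 550, 800, 1500]

# n=4 asks for the three quartile cut points; "exclusive" uses (n + 1) positions
q1, q2, q3 = quantiles(earnings, n=4, method="exclusive")
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in earnings if x < lower_fence or x > upper_fence]

print(q1, q2, q3)                 # 310.0 400.0 525.0
print(iqr)                        # 215.0
print(lower_fence, upper_fence)   # -12.5 847.5
print(outliers)                   # [1500]
```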
Frequently Asked Questions
What is a box plot and what does it show?
A box plot (also called a box-and-whisker plot) is a standardized way of displaying the distribution of data based on a five-number summary: minimum, first quartile (Q1), median (Q2), third quartile (Q3), and maximum. The box spans from Q1 to Q3, representing the interquartile range (IQR) containing the middle 50 percent of the data. A line inside the box marks the median. Whiskers extend from the box to the smallest and largest values within 1.5 times the IQR from the box edges. Points beyond the whiskers are plotted individually as outliers. Box plots are valuable because they concisely show data center, spread, skewness, and unusual observations in a single compact graphic.
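To make the whisker rule concrete, here is a minimal Python sketch that finds where the whiskers end: at the smallest and largest data values that still lie inside the 1.5 times IQR fences. The function name whisker_ends is hypothetical, and the quartiles are passed in rather than computed so the sketch does not depend on any particular quartile method.

```python
def whisker_ends(data, q1, q3):
    """Return the (low, high) whisker ends: extreme data values inside the fences."""
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Keep only the points that are not outliers
    inside = [x for x in data if low_fence <= x <= high_fence]
    return min(inside), max(inside)


# Data and quartiles from Example 2 (Q1 = 310, Q3 = 525)
earnings = [200, 250, 300, 320, 350, 380, 400, 420, 450, 500, 550, 800, 1500]
print(whisker_ends(earnings, 310, 525))   # (200, 800)
```

Note that the upper whisker stops at 800, an actual data value, not at the fence value of 847.5.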
How are quartiles calculated in a box plot?
Quartiles divide sorted data into four equal parts. The first quartile (Q1) is the value below which 25 percent of the data falls, the second quartile (Q2 or median) splits the data in half, and the third quartile (Q3) is the value below which 75 percent of the data falls. There are several methods for computing quartiles. The exclusive method (used here) calculates the position as (n+1) times the quartile fraction and interpolates between adjacent values. The inclusive method instead places the position at (n-1) times the fraction plus 1. For odd-sized datasets, some methods include the median in both halves while others exclude it. Different statistical software may produce slightly different quartile values because of these method variations.
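The exclusive position rule can be written out by hand, as in the Python sketch below. The function name quartile_exclusive is hypothetical, and clamping positions that fall off either end of the data is an assumption of this sketch. For contrast, the last line calls Python's own statistics.quantiles with method="inclusive" on the Example 2 data, which shows how much the choice of method can move Q1 and Q3.

```python
import math
from statistics import quantiles


def quartile_exclusive(data, fraction):
    """Quantile at 1-based position (n + 1) * fraction, linearly interpolated."""
    xs = sorted(data)
    n = len(xs)
    position = (n + 1) * fraction
    k = math.floor(position)      # 1-based index of the value just below
    if k < 1:                     # position falls before the first value
        return float(xs[0])
    if k >= n:                    # position falls at or after the last value
        return float(xs[-1])
    weight = position - k         # fractional distance toward the next value
    return xs[k - 1] + weight * (xs[k] - xs[k - 1])


earnings = [200, 250, 300, 320, 350, 380, 400, 420, 450, 500, 550, 800, 1500]
by_hand = [quartile_exclusive(earnings, f) for f in (0.25, 0.5, 0.75)]
print(by_hand)                                        # [310.0, 400.0, 525.0]
print(quantiles(earnings, n=4, method="inclusive"))   # [320.0, 400.0, 500.0]
```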
How are outliers identified in a box plot?
Outliers in a box plot are identified using the IQR rule. First, compute the inner fences: lower fence = Q1 - 1.5 * IQR, and upper fence = Q3 + 1.5 * IQR. Any data point below the lower fence or above the upper fence is flagged as an outlier. Next, compute the outer fences: Q1 - 3 * IQR and Q3 + 3 * IQR. Outliers that lie between the inner and outer fences are mild outliers; those beyond the outer fences are extreme outliers. The 1.5 multiplier was chosen by John Tukey, and the inner fences capture approximately 99.3 percent of data in a normal distribution, meaning about 0.7 percent would be flagged as outliers even in perfectly normal data. This rule is more objective than visual inspection, although strongly skewed or heavy-tailed data can produce many flagged points that are not errors.
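The 99.3 percent figure can be checked numerically. The sketch below uses Python's statistics.NormalDist for a standard normal distribution; the variable names are illustrative only.

```python
from statistics import NormalDist

std_normal = NormalDist()             # mean 0, standard deviation 1
q1 = std_normal.inv_cdf(0.25)         # first quartile, about -0.674
q3 = std_normal.inv_cdf(0.75)         # third quartile, about 0.674
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr          # about -2.698
upper_fence = q3 + 1.5 * iqr          # about 2.698

# Probability mass between the two inner fences
inside = std_normal.cdf(upper_fence) - std_normal.cdf(lower_fence)
print(round(100 * inside, 1))         # 99.3
print(round(100 * (1 - inside), 1))   # 0.7
```

In other words, the inner fences sit roughly 2.7 standard deviations from the mean of a normal distribution.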
What does the shape of a box plot tell you about the data?
The shape of a box plot reveals the distribution characteristics at a glance. If the median line is centered in the box and the whiskers are of equal length, the data is approximately symmetric. If the median is closer to Q1 with a longer upper whisker, the data is right-skewed (positively skewed), common in income data and waiting times. If the median is closer to Q3 with a longer lower whisker, the data is left-skewed (negatively skewed), seen in exam scores with a ceiling effect. The length of the box, from Q1 to Q3, represents the IQR: a short box indicates concentrated data while a long box shows more spread. Multiple outliers on one side reinforce the skewness assessment.
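One way to put a number on where the median sits inside the box is the quartile (Bowley) skewness coefficient. This page does not use that measure, so treat the Python sketch below as an optional illustration; the function name quartile_skewness is hypothetical.

```python
def quartile_skewness(q1, median, q3):
    """Bowley skewness: positive when the median sits closer to Q1 (right skew)."""
    iqr = q3 - q1
    if iqr == 0:
        raise ValueError("Q1 and Q3 are equal, so the box has no length")
    # Compare the upper half of the box with the lower half, scaled by the IQR
    return ((q3 - median) - (median - q1)) / iqr


# Quartiles from Example 2: the median (400) is closer to Q1 (310) than to Q3 (525)
print(round(quartile_skewness(310, 400, 525), 3))   # 0.163
```

The result always lies between -1 and 1; values near 0 indicate a median centered in the box.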
How do you compare multiple groups using box plots?
Side-by-side box plots are one of the most effective ways to compare distributions across groups. When the plots share an axis, you can quickly compare medians (center), box lengths (spread), whisker lengths (range of typical values), and outlier patterns. If the boxes of two groups do not overlap, the groups probably differ, and a median that falls outside the other group's box points the same way. These are visual rules of thumb rather than tests: a box plot does not show sample size, so confirm any apparent difference with a formal statistical test. This technique is commonly used in clinical trials to compare treatment groups, in education to compare test scores across schools, and in manufacturing to compare quality across production lines.
What is the difference between a box plot and a histogram?
Box plots and histograms both display data distributions but in fundamentally different ways. Histograms show the full shape of the distribution using bars whose height represents frequency or density for each value range (bin). They reveal modes, gaps, and detailed distributional shape but require choices about bin width that can change the visual appearance. Box plots summarize the distribution with just five numbers plus outliers, making them more compact but hiding multimodality (multiple peaks). Box plots excel at comparing multiple groups side by side and are better at highlighting outliers. Histograms are better for understanding the detailed shape of a single distribution. Using both together provides the most complete picture.