The Five-Number Summary & Box-and-Whisker Plots
#

The five-number summary is the backbone of Grade 10 statistics. It condenses an entire data set into 5 key values, which you then use to draw a box-and-whisker plot — the most important graph in this section.

Step 1: Sort the Data
#

Always sort the data from smallest to largest before doing anything else. Skipping this step is the most common mistake in this section: unsorted data gives the wrong median and the wrong quartiles.

Step 2: Find the Five Numbers
#

Value	What it is	How to find it
Minimum	Smallest value	First number after sorting
$Q_1$ (Lower Quartile)	25th percentile	Median of the bottom half
$Q_2$ (Median)	50th percentile	Middle value of the full data set
$Q_3$ (Upper Quartile)	75th percentile	Median of the top half
Maximum	Largest value	Last number after sorting

Finding the Median ($Q_2$)
#

If $n$ is odd: median = middle value at position $\frac{n+1}{2}$
If $n$ is even: median = average of the two middle values, at positions $\frac{n}{2}$ and $\frac{n}{2} + 1$
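The two rules can be sketched in Python. The function name `median` and the choice to sort inside the function are illustrative assumptions; the second data set is made up to show the even case.

```python
def median(values):
    """Return the median using the odd/even rules above."""
    data = sorted(values)      # always sort first
    n = len(data)
    mid = n // 2
    if n % 2 == 1:
        # odd n: position (n + 1) / 2 counts from 1, so the list index is n // 2
        return data[mid]
    # even n: average the two middle values
    return (data[mid - 1] + data[mid]) / 2


print(median([10, 3, 18, 7, 5, 16, 8, 14, 12]))  # 10
print(median([8, 3, 7, 5]))                       # 6.0
```

Both lists are deliberately unsorted, so the result is only correct because the function sorts before it picks the middle.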

Finding $Q_1$ and $Q_3$
#

Split the data into two halves at the median. If $n$ is odd, exclude the median from both halves; if $n$ is even, the halves are simply the first $\frac{n}{2}$ and the last $\frac{n}{2}$ values. Then find the median of each half.
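A minimal sketch of this splitting rule in Python follows. The helper names `median_sorted` and `five_number_summary` are hypothetical, the sample data is made up, and the code assumes at least two data values.

```python
def median_sorted(data):
    """Median of a list that is already sorted."""
    n = len(data)
    mid = n // 2
    return data[mid] if n % 2 == 1 else (data[mid - 1] + data[mid]) / 2


def five_number_summary(values):
    """Return (min, Q1, Q2, Q3, max); the median is left out of both halves when n is odd."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]                                      # bottom half
    upper = data[half + 1:] if n % 2 == 1 else data[half:]   # top half
    return (data[0], median_sorted(lower), median_sorted(data),
            median_sorted(upper), data[-1])


print(five_number_summary([2, 4, 6, 8, 10, 12, 14]))  # (2, 4, 8, 12, 14)
print(five_number_summary([1, 2, 3, 4, 5, 6]))        # (1, 2, 3.5, 5, 6)
```

Other textbooks and software use slightly different quartile rules, so this sketch only reflects the method described in these notes.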

Worked Example
#

Data (already sorted): $3;\; 5;\; 7;\; 8;\; 10;\; 12;\; 14;\; 16;\; 18$

$n = 9$ (odd)

Median ($Q_2$): position $\frac{9+1}{2} = 5$th value = 10

Bottom half (exclude median): $3;\; 5;\; 7;\; 8$, so $Q_1 = \frac{5 + 7}{2} = 6$

Top half (exclude median): $12;\; 14;\; 16;\; 18$, so $Q_3 = \frac{14 + 16}{2} = 15$

Min	$Q_1$	$Q_2$	$Q_3$	Max
3	6	10	15	18
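As a quick cross-check, Python's standard `statistics.quantiles` function gives the same quartiles for this data set. Treat this as a sketch: the library uses an interpolation rule rather than the median-of-halves method, so the two can disagree on other data sets (for example, many with an even number of values).

```python
import statistics

data = [3, 5, 7, 8, 10, 12, 14, 16, 18]

# n=4 asks for quartiles; "exclusive" is the library's default rule
q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")
summary = (min(data), q1, q2, q3, max(data))
print(summary)  # (3, 6.0, 10.0, 15.0, 18)
```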

Measures of Spread
#

Measure	Formula (worked example values)	What it tells you
Range	Max $-$ Min = $18 - 3 = 15$	Total spread
IQR	$Q_3 - Q_1 = 15 - 6 = 9$	Spread of the middle 50%

💡 The IQR is more reliable than the range because it ignores extreme values (outliers). Exam questions often ask “which is the better measure of spread?” — the answer is usually IQR.
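The sketch below shows why, under one made-up assumption: the largest value in the worked example (18) is replaced by an outlier of 80. The helper names `quartiles` and `spread` are hypothetical.

```python
def quartiles(sorted_data):
    """Q1 and Q3 by the median-of-halves rule (median excluded when n is odd)."""
    def med(xs):
        m = len(xs) // 2
        return xs[m] if len(xs) % 2 else (xs[m - 1] + xs[m]) / 2

    n = len(sorted_data)
    lower = sorted_data[: n // 2]
    upper = sorted_data[(n + 1) // 2:]
    return med(lower), med(upper)


def spread(values):
    data = sorted(values)
    q1, q3 = quartiles(data)
    return {"range": data[-1] - data[0], "iqr": q3 - q1}


original = [3, 5, 7, 8, 10, 12, 14, 16, 18]
with_outlier = [3, 5, 7, 8, 10, 12, 14, 16, 80]  # hypothetical: 18 replaced by 80

print(spread(original))      # {'range': 15, 'iqr': 9.0}
print(spread(with_outlier))  # {'range': 77, 'iqr': 9.0}
```

One extreme value pushes the range from 15 to 77, while the IQR stays at 9.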

Drawing a Box-and-Whisker Plot
#

Draw a number line to scale covering the full range
Mark the 5 values on the number line
Draw a box from $Q_1$ to $Q_3$
Draw a vertical line inside the box at the median ($Q_2$)
Draw whiskers (horizontal lines) from the box to the minimum and maximum
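To see what "to scale" means, here is a rough text-only sketch of the five steps. The function `ascii_boxplot` and its symbols (`[`, `]`, `M`, `|`) are invented for illustration; in an exam you draw the plot by hand on a number line.

```python
def ascii_boxplot(minimum, q1, median, q3, maximum, scale=1):
    """Draw a one-line box plot to scale: `scale` characters per unit."""
    def col(value):
        return round((value - minimum) * scale)

    row = ["-"] * (col(maximum) + 1)       # whiskers span min to max
    for c in range(col(q1), col(q3) + 1):
        row[c] = "="                       # box from Q1 to Q3
    row[col(minimum)] = "|"                # whisker ends
    row[col(maximum)] = "|"
    row[col(q1)] = "["
    row[col(q3)] = "]"
    row[col(median)] = "M"                 # median line inside the box
    return "".join(row)


print(ascii_boxplot(3, 6, 10, 15, 18))
# |--[===M====]--|
```

Each character stands for one unit on the number line, so the right side of the box is visibly longer than the left, exactly as it would be on a scaled drawing.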

Reading a Box Plot
#

Feature	Interpretation
Median centred in box	Data is symmetric
Median closer to $Q_1$	Positively skewed (tail to the right)
Median closer to $Q_3$	Negatively skewed (tail to the left)
Short box, long whiskers	Data has extreme values but the middle 50% is consistent
Long box	The middle 50% of the data is very spread out
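The median-position rows of this table can be sketched as a small Python check. The function name `describe_skew` is hypothetical, and treating only exactly equal distances as symmetric is a simplifying assumption; on a real plot, nearly equal distances are usually described as roughly symmetric.

```python
def describe_skew(q1, median, q3):
    left = median - q1     # distance from Q1 to the median
    right = q3 - median    # distance from the median to Q3
    if left == right:
        return "symmetric"
    if left < right:
        return "positively skewed"   # median closer to Q1
    return "negatively skewed"       # median closer to Q3


print(describe_skew(6, 10, 15))  # positively skewed
```

For the worked example the median is 4 units from $Q_1$ and 5 units from $Q_3$, so the data is slightly positively skewed.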

Comparing Two Box Plots
#

When asked to compare two data sets using box plots:

Compare the medians — which group performed better overall?
Compare the IQRs — which group was more consistent?
Compare the ranges — which group had more extreme variation?
Comment on skewness — are the distributions similar or different?
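The first three comparisons can be sketched in Python. The function `compare_box_plots` and both five-number summaries are made up for illustration, and "higher median is better" is assumed to suit the context (e.g. test marks).

```python
def compare_box_plots(name_a, a, name_b, b):
    """a and b are (min, Q1, median, Q3, max) tuples."""
    def pick(value_a, value_b, prefer_larger):
        if value_a == value_b:
            return "tie"
        a_wins = value_a > value_b if prefer_larger else value_a < value_b
        return name_a if a_wins else name_b

    return {
        "higher median": pick(a[2], b[2], True),
        "more consistent (smaller IQR)": pick(a[3] - a[1], b[3] - b[1], False),
        "smaller range": pick(a[4] - a[0], b[4] - b[0], False),
    }


class_a = (35, 50, 62, 70, 90)  # hypothetical summary
class_b = (40, 48, 55, 60, 75)  # hypothetical summary
result = compare_box_plots("Class A", class_a, "Class B", class_b)
for question, answer in result.items():
    print(question, "->", answer)
```

Here Class A has the higher median (62 against 55), while Class B has the smaller IQR (12 against 20) and the smaller range (35 against 55). A full exam answer would quote these numbers, not just name the group.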

Grouped Data
#

When data is given in class intervals (e.g., 40–50, 50–60, …):

You cannot find the exact five-number summary
Use the midpoint of each class to estimate the mean: midpoint $= \frac{\text{lower} + \text{upper}}{2}$
Use an ogive (cumulative frequency curve) to estimate $Q_1$, $Q_2$, and $Q_3$

Estimated Mean from a Frequency Table
#

$$\bar{x} = \frac{\sum f \times x_{\text{mid}}}{\sum f}$$

where $f$ = frequency and $x_{\text{mid}}$ = midpoint of each class.
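As a sketch, the formula can be applied to the 50 students' test-score table used in the ogive worked example in these notes. The `(lower, upper, frequency)` tuple layout and the function name `estimated_mean` are illustrative choices.

```python
# (lower boundary, upper boundary, frequency) for each class
classes = [(20, 30, 3), (30, 40, 7), (40, 50, 12),
           (50, 60, 15), (60, 70, 9), (70, 80, 4)]


def estimated_mean(classes):
    # sum of frequency x midpoint, divided by the total frequency
    total_fx = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
    total_f = sum(f for _, _, f in classes)
    return total_fx / total_f


print(estimated_mean(classes))  # 51.4
```

By hand: $\sum f \times x_{\text{mid}} = 75 + 245 + 540 + 825 + 585 + 300 = 2570$ and $\sum f = 50$, so $\bar{x} \approx 51.4$.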

Drawing and Reading an Ogive (Cumulative Frequency Curve)
#

An ogive plots cumulative frequency against the upper boundary of each class. It lets you estimate the median and quartiles for grouped data.

Worked Example: 50 students’ test scores:

Class	Frequency	Cumulative Frequency	Upper Boundary
$20 \leq x < 30$	$3$	$3$	$30$
$30 \leq x < 40$	$7$	$10$	$40$
$40 \leq x < 50$	$12$	$22$	$50$
$50 \leq x < 60$	$15$	$37$	$60$
$60 \leq x < 70$	$9$	$46$	$70$
$70 \leq x < 80$	$4$	$50$	$80$

How to draw: Plot each (upper boundary, cumulative frequency) point: $(30;\, 3)$, $(40;\, 10)$, $(50;\, 22)$, $(60;\, 37)$, $(70;\, 46)$, $(80;\, 50)$. Start the curve at $(20;\, 0)$. Connect with a smooth S-shaped curve.
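The plotting points can be generated from the frequency column with a running total. This is a minimal sketch; the variable names are illustrative.

```python
from itertools import accumulate

upper_boundaries = [30, 40, 50, 60, 70, 80]
frequencies = [3, 7, 12, 15, 9, 4]
first_lower_boundary = 20

# running total of the frequencies gives the cumulative frequency column
cumulative = list(accumulate(frequencies))

# start at (lower boundary of first class, 0), then one point per upper boundary
points = [(first_lower_boundary, 0)] + list(zip(upper_boundaries, cumulative))
print(points)
# [(20, 0), (30, 3), (40, 10), (50, 22), (60, 37), (70, 46), (80, 50)]
```

The last cumulative frequency must equal the total number of data values (here 50), which is a useful check before you start plotting.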

How to read quartiles:

Median ($Q_2$): $\frac{50}{2} = 25$th value → go across from $25$ on the $y$-axis to the curve, then down to the $x$-axis → ≈ 52
$Q_1$: $\frac{50}{4} = 12.5$th value → read across from $12.5$ → ≈ 42
$Q_3$: $\frac{3 \times 50}{4} = 37.5$th value → read across from $37.5$ → ≈ 61
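These readings can be approximated in Python by joining the plotted points with straight lines instead of a smooth curve. That straight-line assumption is a simplification, so the results only roughly match values read off a hand-drawn ogive. The function name `read_ogive` is hypothetical.

```python
points = [(20, 0), (30, 3), (40, 10), (50, 22), (60, 37), (70, 46), (80, 50)]


def read_ogive(points, target):
    """Estimate the x-value where the cumulative frequency reaches `target`."""
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c0 <= target <= c1 and c1 > c0:
            # move along the segment in proportion to how far target is from c0
            return x0 + (target - c0) / (c1 - c0) * (x1 - x0)
    raise ValueError("target is outside the range of the ogive")


n = 50
print(round(read_ogive(points, n / 2), 1))      # 52.0
print(round(read_ogive(points, n / 4), 1))      # 42.1
print(round(read_ogive(points, 3 * n / 4), 1))  # 60.6
```

These agree with the graph readings of about 52, 42 and 61.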

⚠️ Avoid the common ogive errors: always plot against the upper boundary, NOT the midpoint, and start the curve at the lower boundary of the first class with cumulative frequency = 0.

🚨 Common Mistakes
#

Not sorting data first: You MUST sort before finding the median and quartiles.
Including the median in both halves: When $n$ is odd, the median itself is excluded from both the bottom and top halves when finding $Q_1$ and $Q_3$.
Box plot not to scale: The number line must be drawn to scale — spacing must be proportional.
Confusing range and IQR: Range = Max $-$ Min. IQR = $Q_3 - Q_1$. They measure different things.
Grouped data: Don’t try to find exact quartiles from grouped data — use midpoints for the mean and an ogive for quartiles.

💡 Pro Tip
#

If a question asks “which measure of central tendency best represents the data?”:

Symmetric data → mean and median are similar, either works
Skewed data or outliers → the median is better (it’s not pulled by extreme values)
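A short sketch with Python's standard `statistics` module shows the effect. The first list is the worked example's data; the second is a made-up variation with 18 replaced by an outlier of 80.

```python
import statistics

original = [3, 5, 7, 8, 10, 12, 14, 16, 18]
with_outlier = [3, 5, 7, 8, 10, 12, 14, 16, 80]  # hypothetical: 18 replaced by 80

for name, data in [("original", original), ("with outlier", with_outlier)]:
    print(name, round(statistics.mean(data), 2), statistics.median(data))
# original 10.33 10
# with outlier 17.22 10
```

The outlier drags the mean from about 10.33 up to about 17.22, but the median stays at 10.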

🔗 Related Grade 10 topics:
Probability — data analysis connects to probability
📌 Where this leads in Grade 11: Statistics: Standard Deviation & Variance — measuring spread numerically with $\sigma$
