Scatter Plots & Bivariate Data

The Logic of Two Variables

In Grade 10–11, you worked with one variable at a time (univariate data: mean, median, mode, box plots). In Grade 12, we look at the relationship between two variables measured on the same subjects (bivariate data).

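The question we ask: When one variable changes, does the other tend to change with it? A scatter plot can show that two variables are correlated; on its own it cannot show that one causes the other.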

  • Hours studied vs. marks obtained
  • Temperature vs. ice cream sales
  • Age of car vs. resale value

1. Drawing a Scatter Plot

A scatter plot places each data pair $(x; y)$ as a dot on a Cartesian plane.

The Steps

  1. Identify the variables: The independent variable (the “cause”) goes on the x-axis. The dependent variable (the “effect”) goes on the y-axis.
  2. Choose appropriate scales: Use the minimum and maximum values of each variable to pick an axis range and interval size that fit every point.
  3. Plot each point: Each row of the data table becomes one dot.
  4. Do NOT connect the dots: Scatter plots show individual data points, not a continuous function.

Example Data

| Hours Studied ($x$) | 2 | 3 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|
| Test Mark ($y$) | 35 | 40 | 55 | 60 | 68 | 72 | 80 | 85 |

Plot each pair as a point: $(2; 35)$, $(3; 40)$, $(5; 55)$, etc.
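
As a minimal sketch of steps 2 and 3, the example data can be paired up and its minimum and maximum values read off in a few lines of Python. The variable names (`hours`, `marks`, `points`) are illustrative choices, not part of the syllabus.

```python
# Example data from the table above.
hours = [2, 3, 5, 6, 7, 8, 9, 10]          # x: independent variable
marks = [35, 40, 55, 60, 68, 72, 80, 85]   # y: dependent variable

# Each pair of values becomes one (x; y) point on the scatter plot.
points = list(zip(hours, marks))

# The smallest and largest values guide the choice of axis scale.
x_range = (min(hours), max(hours))
y_range = (min(marks), max(marks))

print(points[:3])        # first three points to plot
print(x_range, y_range)  # axis ranges to cover
```

Here the x-axis must cover 2 to 10 and the y-axis 35 to 85, so scales such as 0 to 10 and 0 to 100 would fit every point.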


2. Describing the Correlation

After plotting, describe the pattern:

Direction

| Pattern | Name |
|---|---|
| Dots trend upward (↗) | Positive correlation |
| Dots trend downward (↘) | Negative correlation |
| Dots are scattered randomly | No correlation |

Strength

| Pattern | Strength |
|---|---|
| Dots are close to a straight line | Strong correlation |
| Dots are loosely grouped around a trend | Moderate correlation |
| Dots are widely scattered | Weak correlation |
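
Direction and strength can also be checked numerically with the correlation coefficient $r$: its sign gives the direction, and the closer it is to $1$ or $-1$, the stronger the linear correlation. The sketch below uses the hours and marks data from the Example Data table; the helper name `pearson_r` is an assumption made for illustration, and in an exam you would normally get $r$ from your calculator.

```python
from math import sqrt

hours = [2, 3, 5, 6, 7, 8, 9, 10]
marks = [35, 40, 55, 60, 68, 72, 80, 85]

def pearson_r(xs, ys):
    """Correlation coefficient r for paired data (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

r = pearson_r(hours, marks)

# The sign of r gives the direction of the correlation.
direction = "positive" if r > 0 else "negative" if r < 0 else "none"
print(round(r, 3), direction)
```

For this data $r$ comes out very close to $1$, which matches what the plot shows: a strong positive correlation.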

Form

| Pattern | Form |
|---|---|
| Trend follows a straight line | Linear |
| Trend follows a curve | Non-linear (exponential, quadratic, etc.) |

3. The Line of Best Fit (by Eye)

Before learning the formal regression formula, you should be able to draw a line of best fit by eye:

  1. The line should pass through the middle of the data cloud.
  2. Roughly equal numbers of points should be above and below the line.
  3. The line should pass through the point $(\bar{x}; \bar{y})$ — the mean of both variables.
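
The three guidelines above can be sketched in Python using the hours and marks data from the Example Data table. The gradient used here (rise over run between the first and last points) is only an assumed stand-in for a judgement made by eye, and the function name `line` is illustrative.

```python
hours = [2, 3, 5, 6, 7, 8, 9, 10]
marks = [35, 40, 55, 60, 68, 72, 80, 85]

# Guideline 3: the line must pass through the mean point.
mean_x = sum(hours) / len(hours)
mean_y = sum(marks) / len(marks)

# Assumed "by eye" gradient: rise over run between first and last points.
slope = (marks[-1] - marks[0]) / (hours[-1] - hours[0])

def line(x):
    """Candidate line of best fit through the mean point."""
    return mean_y + slope * (x - mean_x)

# Guideline 2: count the points above and below the candidate line.
above = sum(1 for x, y in zip(hours, marks) if y > line(x))
below = sum(1 for x, y in zip(hours, marks) if y < line(x))

print((mean_x, mean_y), slope, above, below)
```

The mean point is $(6.25; 61.875)$. With this gradient, 3 points lie above the line and 5 below, all of them close to it, which is the kind of check you would make before accepting a line drawn by eye.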

4. Outliers

An outlier is a data point that lies far away from the general trend.

How to identify: A point that is clearly separated from the rest of the scatter plot.

Impact: Outliers can significantly affect the:

  • Mean (pulled toward the outlier)
  • Regression line (tilted toward the outlier)
  • Correlation coefficient (weakened or artificially strengthened)

What to do: Note the outlier. If the question asks you to recalculate after removing it, exclude that data pair from your calculations.
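
To see the impact in numbers, the sketch below compares the mean mark and the gradient of the least squares line with and without one extra point. The outlier $(10; 30)$ is hypothetical and is not part of the example data; the helper name `least_squares` is also an illustrative assumption.

```python
def least_squares(xs, ys):
    """Return (a, b) for the least squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x
    return a, b

hours = [2, 3, 5, 6, 7, 8, 9, 10]
marks = [35, 40, 55, 60, 68, 72, 80, 85]

# Hypothetical outlier: 10 hours studied but a mark of only 30.
hours_out = hours + [10]
marks_out = marks + [30]

a_clean, b_clean = least_squares(hours, marks)
a_out, b_out = least_squares(hours_out, marks_out)

mean_clean = sum(marks) / len(marks)
mean_out = sum(marks_out) / len(marks_out)

print(round(mean_clean, 2), round(b_clean, 2))  # without the outlier
print(round(mean_out, 2), round(b_out, 2))      # with the outlier
```

Both the mean mark and the gradient drop once the hypothetical point is included, which is why a question may ask you to recalculate after removing an outlier.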


5. Revision: Univariate Data Concepts

These Grade 10–11 concepts may still appear in Paper 2:

Measures of Central Tendency

  • Mean: $\bar{x} = \frac{\sum x}{n}$
  • Median: Middle value when data is ordered
  • Mode: Most frequent value

Measures of Spread

  • Range: Maximum − Minimum
  • Interquartile Range (IQR): $Q_3 - Q_1$
  • Standard Deviation: How far data points typically are from the mean
  • Variance: (Standard Deviation)$^2$
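
A quick way to check hand calculations is Python's built-in `statistics` module, sketched here on the test marks from the Example Data table. This sketch assumes the population formulas (dividing by $n$), so it uses `pstdev` and `pvariance`; if your calculator mode or question uses the sample formulas, the values will differ slightly.

```python
import statistics

marks = [35, 40, 55, 60, 68, 72, 80, 85]

data_range = max(marks) - min(marks)    # Maximum - Minimum
sd = statistics.pstdev(marks)           # population standard deviation
variance = statistics.pvariance(marks)  # population variance = sd squared

print(data_range, round(sd, 2), round(variance, 2))
```

The IQR is left out on purpose: software packages use different conventions for locating $Q_1$ and $Q_3$, so a library value may not match the method taught in class.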

Five Number Summary

Minimum, $Q_1$, Median, $Q_3$, Maximum → used to draw Box-and-Whisker plots.

Ogive (Cumulative Frequency Curve)

  • Plot cumulative frequencies against upper class boundaries.
  • Use the ogive to estimate the median, quartiles, and percentiles.
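
As a rough sketch, the points for an ogive can be built by accumulating class frequencies. The grouped data below is hypothetical (invented purely for illustration), as are the variable names.

```python
from itertools import accumulate

# Hypothetical grouped data: upper class boundaries and class frequencies.
upper_boundaries = [40, 50, 60, 70, 80, 90]
frequencies = [3, 7, 12, 10, 5, 3]

# Running total of the frequencies gives the cumulative frequencies.
cumulative = list(accumulate(frequencies))

# Ogive points: (upper class boundary; cumulative frequency).
ogive_points = list(zip(upper_boundaries, cumulative))

# The median sits at position n/2 on the cumulative frequency axis.
n = cumulative[-1]
median_position = n / 2

# First class whose cumulative frequency reaches the median position.
median_class_upper = next(b for b, cf in ogive_points if cf >= median_position)

print(ogive_points)
print(median_position, median_class_upper)
```

On the drawn ogive you would read across from the median position on the vertical axis to the curve and then down to the horizontal axis; the code only identifies which class that reading falls in.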

🚨 Common Mistakes

  1. Swapping x and y: The independent variable (what you control or the “cause”) goes on the x-axis. Getting this wrong changes the entire regression equation.
  2. Confusing correlation with causation: Just because two variables correlate doesn’t mean one causes the other. Ice cream sales and drownings both increase in summer — but ice cream doesn’t cause drowning!
  3. Drawing the line of best fit through $(0; 0)$: The line of best fit does NOT have to pass through the origin unless the data shows it.
  4. Ignoring outliers in interpretation: If there’s a clear outlier, mention it in your answer and explain its potential effect.

💡 Pro Tip: The Mean Point

The calculated (least squares) regression line always passes through the mean point $(\bar{x}; \bar{y})$, so a line of best fit drawn by eye should pass through it too. If your line doesn’t pass through this point, adjust it.
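
A short check of why this holds for the calculated line, assuming it is written as $\hat{y} = a + bx$ with $a = \bar{y} - b\bar{x}$ (notation that this section does not define): substituting $x = \bar{x}$ gives

$$\hat{y} = a + b\bar{x} = (\bar{y} - b\bar{x}) + b\bar{x} = \bar{y}$$

so the predicted value at $\bar{x}$ is exactly $\bar{y}$, whatever the gradient $b$ is.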

