
Anscombe's Quartet: Four Datasets That Make Statistics Look Stupid

C. Pearson
5 min read

Frank Anscombe was annoyed. It was 1973, and statisticians were getting drunk on computing power, feeding numbers into machines, getting summary statistics back, and calling it analysis. So he constructed a trap.


Four datasets. Same mean. Same variance. Same correlation. Same regression line. Statistically near-identical by the usual summary measures.

Completely, utterly different.

That's Anscombe's Quartet, and fifty years later, most analysts still haven't absorbed the lesson it was designed to teach.

What the Numbers Say (and What They're Hiding)

Here's what all four datasets share, to two or three decimal places:

  • Mean of x: 9.00
  • Mean of y: 7.50
  • Variance of x: 11.00
  • Variance of y: ~4.125 (the four agree to within about 0.003)
  • Correlation: 0.816
  • Regression line: y = 3.00 + 0.500x

If you handed those numbers to someone without context, they'd say the datasets are interchangeable. Same story, four times over.
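You don't have to take those numbers on faith. Here is a minimal sketch that recomputes them with nothing but Python's standard library. The data values are transcribed from Anscombe's published table; the names `QUARTET`, `summarize`, and `SUMMARIES` are illustrative choices, not part of any library.

```python
from statistics import mean, variance

# Anscombe's quartet. Datasets I-III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return the summary statistics from the list above for one dataset."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx  # ordinary least-squares slope of y on x
    return {
        "mean_x": mx,
        "mean_y": my,
        "var_x": variance(xs),  # sample variance (n - 1 denominator)
        "var_y": variance(ys),
        "r": sxy / (sxx * syy) ** 0.5,  # Pearson correlation
        "slope": slope,
        "intercept": my - slope * mx,
    }


SUMMARIES = {name: summarize(xs, ys) for name, (xs, ys) in QUARTET.items()}

for name, stats in SUMMARIES.items():
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

Run it and the four printed rows are nearly indistinguishable, which is exactly the problem.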

Plot them, and you see this:

  • Dataset I is what you hope your data looks like: a clean linear relationship with some scatter. Normal. Boring. Fine.
  • Dataset II is a perfect curve, unmistakably nonlinear. A straight regression line is the wrong model entirely. The fit isn't just imprecise; it's categorically wrong.
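  • Dataset III has a near-perfect linear relationship... except for one outlier that tilts the fitted line away from the other ten points. Remove that single point and the correlation jumps from 0.816 to essentially 1.0, while the slope drops from 0.50 to about 0.35. That one observation is the only reason the regression line matches the other three datasets.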
  • Dataset IV is almost philosophically disturbing. All x-values are identical except one. The correlation exists solely because of that one lone point. Without it, x has no variance at all, so there's no relationship to measure and the correlation isn't even defined: just a vertical stripe of dots.
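The claims about Datasets III and IV are easy to check by deleting one point from each and refitting. The sketch below does that with the standard library only. The helper `fit` and the shortcut of picking the suspect point by its largest y (Dataset III) or largest x (Dataset IV) are assumptions made for this illustration, not a general outlier-detection method.

```python
from statistics import mean


def fit(points):
    """Least-squares slope and Pearson r; (None, None) if x or y is constant."""
    xs, ys = zip(*points)
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None, None  # correlation is undefined without variance
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx, sxy / (sxx * syy) ** 0.5


III = list(zip(
    [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5],
    [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73],
))
IV = list(zip(
    [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
    [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89],
))

# Shortcut for this demo: the suspect points are visible on a scatter plot.
outlier_iii = max(III, key=lambda p: p[1])  # highest y in Dataset III
lever_iv = max(IV, key=lambda p: p[0])      # the only x that differs in Dataset IV

slope_all, r_all = fit(III)
slope_trim, r_trim = fit([p for p in III if p != outlier_iii])
slope_iv, r_iv = fit(IV)
slope_iv_trim, r_iv_trim = fit([p for p in IV if p != lever_iv])

print(f"III with outlier:    slope={slope_all:.3f}  r={r_all:.3f}")
print(f"III without outlier: slope={slope_trim:.3f}  r={r_trim:.5f}")
print(f"IV with lone point:  slope={slope_iv:.3f}  r={r_iv:.3f}")
print(f"IV without it:       slope={slope_iv_trim}  r={r_iv_trim}")
```

In both cases a single observation is doing most of the statistical work, and nothing in the summary table hints at it.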

Same statistics. Four different realities.

Why This Matters More Than You Think

Anscombe's point wasn't just "make pretty charts." It was sharper than that: summary statistics are compression, and compression loses information. When you summarize data, you're making a bet that what you discarded doesn't matter. Sometimes you win that bet. Often you don't.

Consider what goes wrong in each case if you skip visualization:

graph TD
    A[Raw Data] --> B{Visualize first?}
    B -->|No| C[Calculate summary stats]
    B -->|Yes| D[Spot the actual pattern]
    C --> E[Report mean, correlation, R²]
    D --> F[Choose the right model]
    E --> G((Wrong conclusions))
    F --> H((Defensible analysis))

Dataset II gets modeled linearly when it's quadratic, so your predictions degrade systematically at the extremes. Dataset III produces a slope that one rogue observation distorts, and you'd never know to investigate it. Dataset IV lets you believe a correlation is a property of the relationship when it's actually a property of a single data point.
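The "degrades systematically at the extremes" part shows up clearly in the residuals. As a rough sketch (standard library only, with variable names chosen just for this example), fit the straight line to Dataset II and list the residuals in order of x:

```python
from statistics import mean

X = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
Y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

# Ordinary least-squares line through Dataset II.
mx, my = mean(X), mean(Y2)
slope = sum((x - mx) * (y - my) for x, y in zip(X, Y2)) / sum((x - mx) ** 2 for x in X)
intercept = my - slope * mx

# Residual = observed y minus the line's prediction, sorted by x.
residuals = sorted((x, y - (intercept + slope * x)) for x, y in zip(X, Y2))

for x, res in residuals:
    print(f"x={x:>2}  residual={res:+.2f}")
```

For a well-specified model the signs would be scattered with no pattern. Here the line overshoots at both ends and undershoots through the middle, which is the signature of a curve being forced into a straight line.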

These aren't edge cases in toy datasets. They happen constantly in production data: A/B test logs, sensor readings, survey responses.

The Habit Nobody Actually Has

Every statistics course tells you to explore your data before modeling it. Scatter plots, histograms, box plots: look at the thing before you summarize it.

And yet. The reality of most analysis workflows is: ingest data, run descriptives, build model, report results. Visualization is an afterthought, something you add to the deck to make the slide look less empty.

Partly this is time pressure. Partly it's the quiet confidence that comes from seeing clean summary numbers: a mean, a correlation, a p-value. Numbers feel like answers. A scatter plot feels like a question.

But that instinct is exactly backwards. The scatter plot is the answer. The summary statistic is the question you're hoping the data said yes to.

What Anscombe Actually Built

Anscombe constructed his quartet deliberately, engineering datasets to hit specific statistical targets while looking completely different visually (exactly how he did it isn't documented). That's not easy. It was a deliberate demonstration that the relationship between data and its summary is many-to-one: infinitely many possible datasets, one set of statistics.

The modern version of this idea is the Datasaurus Dozen (Matejka & Fitzmaurice, 2017): thirteen datasets with matching summary statistics, one of which is shaped like a dinosaur. A literal dinosaur. Same means, same standard deviations, same correlation, to two decimal places, as a star, a circle, or a set of slanted lines.

At that point, the statistics aren't describing the data. They're describing a shadow of it.

The Fix Is Embarrassingly Simple

Plot your data. Not after you've run your analysis, before. Plot the raw distributions. Plot the relationships between variables. Plot the residuals after you fit a model.

If your data is high-dimensional and you can't plot everything, use dimensionality reduction to look at structure. Use pair plots for smaller feature sets. Build the habit of seeing the data, not just measuring it.
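Looking really can be this cheap. In practice you'd reach for whatever plotting library you already use, but as a dependency-free sketch, the hypothetical helper `ascii_scatter` below draws a crude text scatter plot of each quartet dataset. The grid size and the `*` marker are arbitrary choices for the illustration.

```python
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def ascii_scatter(xs, ys, width=40, height=12):
    """Render points as '*' on a width x height character grid (y grows upward)."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each coordinate onto the grid; a constant axis maps to index 0.
        col = round((x - x_min) / (x_max - x_min) * (width - 1)) if x_max > x_min else 0
        row = round((y - y_min) / (y_max - y_min) * (height - 1)) if y_max > y_min else 0
        grid[height - 1 - row][col] = "*"  # flip rows so larger y prints higher
    return "\n".join("".join(line) for line in grid)


PLOTS = {name: ascii_scatter(xs, ys) for name, (xs, ys) in QUARTET.items()}

for name, plot in PLOTS.items():
    print(f"Dataset {name}")
    print(plot)
    print("-" * 40)
```

Even at this resolution, the arch of Dataset II and the vertical stripe of Dataset IV are obvious at a glance, and neither shows up anywhere in the summary table.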

Anscombe didn't need machine learning or a clever algorithm to expose statistical blind spots. He needed four scatter plots and fifty years of people still not getting the message.

Don't be the analyst trusting the correlation coefficient of a dinosaur.
