Heteroscedasticity: Why Your Regression's Error Bars Are Lying in a Pattern
C. Pearson
Your regression model looks fine. R-squared is respectable. The coefficients make intuitive sense. You ship the analysis.
Then someone asks why the model performs so well on small accounts and falls apart on large ones. You shrug. You blame the data. The real culprit has been sitting in your residual plot the whole time, waiting to be noticed.
Heteroscedasticity. The word is a mouthful, but the concept is simple: your model's errors are not equally spread across all values of your predictor. They fan out, or funnel in, or cluster in ways that reveal your model is not treating the problem uniformly. That non-uniformity matters enormously, and most analysts skip right past it.
What Uniform Errors Are Supposed to Look Like
Ordinary least squares regression carries an assumption baked into its math: the variance of the errors is constant across all values of the predictors. This property has a name (homoscedasticity, the well-behaved counterpart of our problem) and a visual signature. Plot your residuals against your fitted values, and you should see a horizontal band of points with no discernible shape. Random scatter. Noise without a story.
Heteroscedasticity looks like a story. A cone shape is the classic: errors are tight near low fitted values and sprawl wide near high fitted values. You see this constantly in financial data, where prediction errors for small companies are small and prediction errors for large companies are enormous. You see it in biological data, where measurement noise scales with the thing being measured. You see it almost everywhere real, messy data lives.
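If you want a number to go with the picture, here is a minimal sketch using only the Python standard library. The data is simulated and invented for illustration (noise that grows in proportion to x), and the names `low_sd` and `high_sd` are just labels for this example. It fits a one-predictor OLS line, sorts the residuals by fitted value, and compares their spread at the low end and the high end. A cone shows up as a much larger spread in the top third.

```python
import random
import statistics

random.seed(0)

# Simulated data, invented for illustration: noise SD grows in proportion to x.
n = 2000
x = [random.uniform(1, 10) for _ in range(n)]
y = [2.0 + 3.0 * xi + random.gauss(0, 0.5 * xi) for xi in x]

# Closed-form simple OLS: slope = Sxy / Sxx.
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
slope = sxy / sxx
intercept = y_bar - slope * x_bar

fitted = [intercept + slope * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, fitted)]

# Sort by fitted value, then compare residual spread in the bottom and top thirds.
pairs = sorted(zip(fitted, resid))
third = n // 3
low_sd = statistics.pstdev([r for _, r in pairs[:third]])
high_sd = statistics.pstdev([r for _, r in pairs[-third:]])

print(f"residual SD, lowest third of fitted values:  {low_sd:.2f}")
print(f"residual SD, highest third of fitted values: {high_sd:.2f}")
```

With homoscedastic data the two numbers would be roughly equal. This is a rough check, not a replacement for actually looking at the plot.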
graph TD
A[Fit OLS Regression] --> B{Plot Residuals vs Fitted}
B --> C[/Random scatter: homoscedastic/]
B --> D{Cone or fan shape}
D --> E[Heteroscedasticity suspected]
E --> F[Run Breusch-Pagan or White Test]
F --> G[Transform variables or use WLS]
F --> H[Use robust standard errors]
Why It Breaks Things You Care About
Here is what heteroscedasticity actually does to your output.
Your coefficient estimates stay unbiased. That part is fine. OLS will still find the right average relationship between your variables, although it is no longer the most precise estimator available. The damage shows up in your standard errors, because the usual formula assumes a single error variance that does not exist. Not a little wrong. Systematically wrong in a direction that depends on the structure of your data.
When standard errors are wrong, every significance test built on top of them is also wrong. Your p-values drift. Your confidence intervals misrepresent the actual uncertainty. A coefficient that looks statistically significant at p = 0.03 might not survive honest standard errors. One that looks marginal might actually be rock solid. You cannot tell which way the problem cuts without investigating.
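A small simulation makes the "unbiased slope, wrong standard error" claim concrete. This is a sketch under stated assumptions: the data-generating process (true slope 3, noise SD proportional to x squared) is invented, and `fit_slope_and_classical_se` is a hypothetical helper written for this example. It refits the same model on many simulated datasets and compares how much the slope actually varies with what the textbook formula reports.

```python
import random

random.seed(1)

def fit_slope_and_classical_se(x, y):
    """Return the OLS slope and its textbook (constant-variance) standard error."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = rss / (n - 2)
    return slope, (sigma2_hat / sxx) ** 0.5

n, reps, true_slope = 200, 1000, 3.0
# Fixed, evenly spaced design on [1, 10], reused in every replication.
x = [1.0 + 9.0 * i / (n - 1) for i in range(n)]

slopes, classical_ses = [], []
for _ in range(reps):
    # Assumed noise structure: SD grows with the square of x.
    y = [2.0 + true_slope * xi + random.gauss(0, 0.1 * xi ** 2) for xi in x]
    b, se = fit_slope_and_classical_se(x, y)
    slopes.append(b)
    classical_ses.append(se)

mean_slope = sum(slopes) / reps
empirical_sd = (sum((b - mean_slope) ** 2 for b in slopes) / (reps - 1)) ** 0.5
mean_classical_se = sum(classical_ses) / reps

print(f"average slope estimate:        {mean_slope:.3f}")
print(f"actual SD of slope estimates:  {empirical_sd:.3f}")
print(f"average reported classical SE: {mean_classical_se:.3f}")
```

In this setup the average slope lands close to the true value while the reported standard error understates the real variability. Other variance patterns can push the error in the opposite direction, which is why you cannot guess the sign without investigating.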
For prediction intervals, the damage is even more direct. If your errors fan out at high values, then the tight, uniform prediction bands your model produces at high values are simply false. You are telling users "we expect the outcome to fall in this range" when the actual spread of outcomes is far wider.
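You can check the prediction band claim directly by counting how often fresh observations land inside a nominal 95% band. The sketch below uses simulated data (an assumption, with noise SD proportional to x) and a simplified band of plus or minus 1.96 residual standard deviations, which ignores the small leverage term in the exact formula. The helpers `simulate` and `coverage` are hypothetical names for this example.

```python
import random

random.seed(2)

def simulate(n):
    """Simulated data, invented for illustration: noise SD proportional to x."""
    x = [random.uniform(1, 10) for _ in range(n)]
    y = [2.0 + 3.0 * xi + random.gauss(0, 0.5 * xi) for xi in x]
    return x, y

# Fit a simple OLS line on a training sample.
x, y = simulate(2000)
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
intercept = y_bar - slope * x_bar
rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
resid_sd = (rss / (n - 2)) ** 0.5

# Constant-width nominal 95% band (leverage term omitted for simplicity).
half_width = 1.96 * resid_sd

# Score the band on fresh data, separately for low and high x.
x_new, y_new = simulate(6000)

def coverage(lo, hi):
    """Share of new points with lo <= x <= hi that fall inside the band."""
    hits = [abs(yi - (intercept + slope * xi)) <= half_width
            for xi, yi in zip(x_new, y_new) if lo <= xi <= hi]
    return sum(hits) / len(hits)

cover_low = coverage(1, 4)
cover_high = coverage(7, 10)
print(f"coverage for x in [1, 4]:  {cover_low:.1%}")
print(f"coverage for x in [7, 10]: {cover_high:.1%}")
```

The same band is far too generous at the low end and too narrow at the high end. The overall coverage can still look close to 95%, which is how the problem hides.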
Detecting It
Residual plots first. Always residual plots first. Plot residuals against fitted values and against each predictor individually. Your eyes will catch a cone shape faster than any test will.
For formal confirmation, the Breusch-Pagan test regresses your squared residuals on your predictors. A significant result means those predictors are explaining variance in your errors, which is exactly what should not be happening. White's test adds the squares and cross-products of the predictors to that auxiliary regression, so it assumes less about the form of the variance and catches nonlinear patterns that Breusch-Pagan can miss.
Neither test tells you how to fix it. They just confirm what your residual plot already showed you.
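To see what the Breusch-Pagan test is doing under the hood, here is a hand-rolled sketch for the one-predictor case. It uses the studentized (Koenker) form of the statistic, LM = n times the R-squared of the auxiliary regression, compared against a chi-squared distribution with one degree of freedom. The data is simulated and the function names are hypothetical. In real work you would more likely call `het_breuschpagan` or `het_white` from `statsmodels.stats.diagnostic`.

```python
import math
import random

random.seed(3)

def fit_line(x, y):
    """Closed-form simple OLS; returns (intercept, slope)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def r_squared(x, y):
    """R-squared of the simple OLS regression of y on x."""
    a, b = fit_line(x, y)
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def breusch_pagan(x, y):
    """Studentized Breusch-Pagan test for a single predictor."""
    a, b = fit_line(x, y)
    sq_resid = [(yi - a - b * xi) ** 2 for xi, yi in zip(x, y)]
    # Auxiliary regression: squared residuals on the predictor.
    lm = max(len(x) * r_squared(x, sq_resid), 0.0)
    # Upper tail of chi-squared with 1 df: P(Z^2 > lm) = erfc(sqrt(lm / 2)).
    p_value = math.erfc(math.sqrt(lm / 2.0))
    return lm, p_value

n = 500
x = [random.uniform(1, 10) for _ in range(n)]
y_het = [2.0 + 3.0 * xi + random.gauss(0, 0.5 * xi) for xi in x]  # fanning noise
y_hom = [2.0 + 3.0 * xi + random.gauss(0, 2.5) for xi in x]       # constant noise

lm_het, p_het = breusch_pagan(x, y_het)
lm_hom, p_hom = breusch_pagan(x, y_hom)
print(f"fanning noise:  LM = {lm_het:.1f}, p = {p_het:.3g}")
print(f"constant noise: LM = {lm_hom:.1f}, p = {p_hom:.3g}")
```

The closed-form p-value only works because there is a single predictor. With more predictors the degrees of freedom change and you need a general chi-squared tail function.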
Fixing It Without Pretending It Was Never There
Three routes, each with tradeoffs.
Transform the outcome variable. A log transformation of a right-skewed outcome often stabilizes variance because it compresses the high end of the scale where errors tend to blow up. This works well when your data is strictly positive and multiplicative relationships make theoretical sense. The catch: your coefficients now describe relationships on the log scale, which requires careful interpretation.
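Here is a sketch of the log transformation doing its job, assuming the textbook case where it fits: a strictly positive outcome with multiplicative noise. The data is simulated for illustration, and `spread_ratio` is a hypothetical helper that reports residual spread in the top third of x relative to the bottom third. A ratio near 1 means roughly constant variance.

```python
import math
import random
import statistics

random.seed(4)

def spread_ratio(x, y):
    """Fit a line by OLS, then return the residual SD in the top third of x
    divided by the residual SD in the bottom third."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    pairs = sorted((xi, yi - intercept - slope * xi) for xi, yi in zip(x, y))
    third = n // 3
    low = statistics.pstdev([r for _, r in pairs[:third]])
    high = statistics.pstdev([r for _, r in pairs[-third:]])
    return high / low

# Assumed data-generating process: log(y) is linear in x with constant noise,
# so y itself has noise that scales with its level.
n = 3000
x = [random.uniform(1, 10) for _ in range(n)]
y = [math.exp(1.0 + 0.3 * xi + random.gauss(0, 0.3)) for xi in x]

ratio_raw = spread_ratio(x, y)
ratio_log = spread_ratio(x, [math.log(yi) for yi in y])

print(f"spread ratio, raw outcome: {ratio_raw:.2f}")
print(f"spread ratio, log outcome: {ratio_log:.2f}")
```

On the log scale the slope reads as an approximate proportional change in y per unit of x, which is the interpretation cost mentioned above.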
Weighted least squares gives more influence to observations with smaller error variance and less to observations with larger variance, ideally weighting each observation by the inverse of its error variance. You need to estimate the weights, which usually means fitting a preliminary model and using the residuals to infer the variance structure. The procedure is iterative, and it is fragile if your weight estimates are off.
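One common way to estimate the weights, sketched here for a single predictor, is to assume the error variance follows a power of x and estimate that power from the preliminary residuals. That variance model is an assumption of this example, not something the data guarantees, and `weighted_line_fit` and `gamma_hat` are hypothetical names. With statsmodels you would pass the resulting weights to `sm.WLS`.

```python
import math
import random

random.seed(5)

def weighted_line_fit(x, y, w):
    """Weighted least-squares line; returns (intercept, slope).
    Equal weights reproduce plain OLS."""
    sw = sum(w)
    xw = sum(wi * xi for wi, xi in zip(w, x)) / sw
    yw = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xw) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xw) * (yi - yw) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return yw - slope * xw, slope

# Simulated data, invented for illustration: noise SD proportional to x,
# so the error variance is proportional to x squared.
n = 2000
x = [random.uniform(1, 10) for _ in range(n)]
y = [2.0 + 3.0 * xi + random.gauss(0, 0.5 * xi) for xi in x]
ones = [1.0] * n

# Step 1: preliminary OLS fit and its residuals.
a_ols, b_ols = weighted_line_fit(x, y, ones)
resid = [yi - a_ols - b_ols * xi for xi, yi in zip(x, y)]

# Step 2: assumed variance model log(e^2) = c + gamma * log(x).
log_x = [math.log(xi) for xi in x]
log_e2 = [math.log(max(e * e, 1e-300)) for e in resid]  # floor guards log(0)
_, gamma_hat = weighted_line_fit(log_x, log_e2, ones)

# Step 3: weights are the inverse of the estimated variance, up to a constant.
weights = [xi ** (-gamma_hat) for xi in x]
a_wls, b_wls = weighted_line_fit(x, y, weights)

print(f"estimated variance exponent: {gamma_hat:.2f}")
print(f"OLS slope: {b_ols:.3f}   WLS slope: {b_wls:.3f}")
```

If the power-law guess is wrong, the weights are wrong too, and that is the fragility described above.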
Robust standard errors (sometimes called heteroscedasticity-consistent or HC standard errors) leave the coefficient estimates alone and just fix the standard errors using a sandwich estimator. This is the most popular modern solution because it is easy to implement and does not require you to correctly specify the variance structure. In R, the sandwich package's vcovHC() computes it in one function call. In Python, statsmodels supports it natively through the cov_type argument to fit(). You keep your original model and stop lying about your uncertainty.
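The sandwich idea is easy to see in the one-predictor case, where the HC0 variance of the slope reduces to a single sum. The sketch below computes it by hand next to the classical standard error. The data is simulated (noise SD proportional to x squared, an assumption for illustration) and this is the basic HC0 variant with no small-sample correction. In practice you would let a library do it, for example `sm.OLS(y, X).fit(cov_type="HC3")` in statsmodels.

```python
import random

random.seed(6)

# Simulated data, invented for illustration: noise SD grows with x squared.
n = 5000
x = [random.uniform(1, 10) for _ in range(n)]
y = [2.0 + 3.0 * xi + random.gauss(0, 0.1 * xi ** 2) for xi in x]

# Ordinary OLS fit; the robust method leaves these estimates untouched.
x_bar, y_bar = sum(x) / n, sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
intercept = y_bar - slope * x_bar
resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

# Classical SE: assumes one shared error variance for every observation.
sigma2_hat = sum(e * e for e in resid) / (n - 2)
classical_se = (sigma2_hat / sxx) ** 0.5

# HC0 sandwich SE for the slope: each squared residual stands in for
# that observation's own error variance.
meat = sum(((xi - x_bar) ** 2) * e * e for xi, e in zip(x, resid))
robust_se = (meat / sxx ** 2) ** 0.5

print(f"slope:        {slope:.3f}")
print(f"classical SE: {classical_se:.4f}")
print(f"HC0 SE:       {robust_se:.4f}")
```

The slope is identical under both approaches. Only the reported uncertainty changes.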
The Diagnostic You Are Probably Skipping
Most regression tutorials spend three paragraphs on R-squared and one sentence on residual diagnostics. That imbalance explains a lot of bad modeling in the wild.
R-squared tells you how much variance your model explains. A residual plot tells you whether your model's behavior is consistent across the range of your data. Both matter. A model with an R-squared of 0.85 and a cone-shaped residual plot is misstating its uncertainty across a large portion of your dataset.
Check your residuals. Every time. The pattern in those errors is telling you something your coefficients cannot.