The Invisible Power of Covariates: How to Overcome Selection Bias in A/B Tests
The Problem Nobody Wants to See
Imagine a large e-commerce company rolls out a newly designed banner and measures the average session duration. A first look at the data is encouraging: an increase of 0.56 minutes (about 34 seconds) per session. Sounds promising, right? But this is where the deeper statistical analysis begins.
The dilemma: how confident can we be that the banner is truly the cause of this improvement? What if long-standing, tech-savvy users systematically see the new banner more often than new customers? The answer leads us to a classic problem in empirical research: selection bias.
T-Test vs. Linear Regression: The Wrong Duel
The classic t-test quickly provides an answer: the difference between the control and treatment groups is 0.56 minutes, and that seems to settle it. A common mistake, however, is to think that linear regression is only relevant for more complex scenarios. That is false.
What happens if we instead fit a linear regression with banner status (1 = visible, 0 = not visible) as the independent variable and session duration as the dependent variable? We get the same treatment coefficient: 0.56 minutes. This is no coincidence. With a single binary regressor, the OLS slope equals the difference in group means, and its t-test is equivalent to the two-sample t-test with pooled (equal) variance, because both test the same null hypothesis of no difference between the groups.
However, the R-squared reveals a problem: at only 0.008, the model explains less than 1% of the variance. It ignores many other factors that actually influence how long users stay on the page.
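The equivalence can be checked numerically. The following is a minimal sketch on simulated data, using only the Python standard library; the sample size, the baseline of 5 minutes, the noise level, and the variable names are illustrative assumptions and not the data behind the 0.56-minute figure. It compares the difference in group means with the OLS slope on the banner dummy, and the pooled-variance t statistic with the t statistic of that slope.

```python
import random
from statistics import mean

random.seed(0)

# Hypothetical data: session duration in minutes, banner = 1 (visible) or 0 (not visible).
n = 2000
banner = [i % 2 for i in range(n)]
duration = [random.gauss(5.0, 3.0) + 0.5 * b for b in banner]

treated = [y for y, b in zip(duration, banner) if b == 1]
control = [y for y, b in zip(duration, banner) if b == 0]

# Quantity examined by the two-sample t-test: difference in group means.
diff_means = mean(treated) - mean(control)

# OLS slope of duration on the banner dummy: S_xy / S_xx.
x_bar, y_bar = mean(banner), mean(duration)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(banner, duration))
sxx = sum((x - x_bar) ** 2 for x in banner)
slope = sxy / sxx
intercept = y_bar - slope * x_bar


def sum_sq(values):
    """Sum of squared deviations from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


# Pooled-variance (equal-variance) t statistic for the difference in means.
df = n - 2
pooled_var = (sum_sq(treated) + sum_sq(control)) / df
t_test = diff_means / (pooled_var * (1 / len(treated) + 1 / len(control))) ** 0.5

# t statistic of the OLS slope: slope / sqrt(residual variance / S_xx).
rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(banner, duration))
t_ols = slope / ((rss / df) / sxx) ** 0.5

print(f"difference in means: {diff_means:.4f} | OLS slope: {slope:.4f}")
print(f"pooled t statistic:  {t_test:.4f} | OLS t statistic: {t_ols:.4f}")
```

Both pairs agree up to floating-point error. The match holds for the pooled-variance t-test; Welch's unequal-variance version generally gives a slightly different t statistic.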
The Game-Changer: Adding Covariates
This is where the true strength of linear regression shows. When we introduce an additional variable – for example, the average session duration of users before the experiment – everything changes dramatically.
The model improves instantly: R-squared jumps to 0.86, so it now explains 86% of the variance. More importantly, the estimated treatment effect drops to 0.47 minutes. Why? The pre-experiment covariate reveals a “snowball effect”: users who already had long sessions before the experiment tend to keep having long sessions, so small initial differences between the groups carry over into the measured outcome.
This insight is crucial: the original effect of 0.56 was partly inflated by selection bias, by roughly 0.09 minutes (0.56 − 0.47). Users with naturally longer sessions were not randomly distributed between groups; they were more concentrated in the treatment group.
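To make the mechanism concrete, here is a small sketch of covariate adjustment on simulated data, again with only the Python standard library. Everything in it is an assumption made for illustration: the true effect of 0.5 minutes, the way prior session duration drives both the outcome and the chance of seeing the banner, and all variable names. It fits the regression once with the banner dummy alone and once with the pre-experiment duration added, solving the two-regressor normal equations by hand.

```python
import random
from statistics import mean

random.seed(42)

TRUE_EFFECT = 0.5  # assumed true banner effect in minutes (illustrative)
n = 20000

pre, banner, duration = [], [], []
for _ in range(n):
    # Hypothetical pre-experiment session duration (a rough normal approximation).
    p = random.gauss(5.0, 3.0)
    # Non-random assignment: users with long prior sessions see the banner more often.
    t = 1 if random.random() < (0.55 if p > 5.0 else 0.45) else 0
    y = 1.0 + 0.9 * p + TRUE_EFFECT * t + random.gauss(0.0, 1.0)
    pre.append(p)
    banner.append(t)
    duration.append(y)


def centered(values):
    m = mean(values)
    return [v - m for v in values]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Centering all variables absorbs the intercept.
t_c, p_c, y_c = centered(banner), centered(pre), centered(duration)
tss = dot(y_c, y_c)

# Model 1: duration ~ banner
naive = dot(t_c, y_c) / dot(t_c, t_c)
rss_naive = sum((y - naive * t) ** 2 for t, y in zip(t_c, y_c))
r2_naive = 1 - rss_naive / tss

# Model 2: duration ~ banner + pre (2x2 normal equations, solved by Cramer's rule)
s_tt, s_pp, s_tp = dot(t_c, t_c), dot(p_c, p_c), dot(t_c, p_c)
s_ty, s_py = dot(t_c, y_c), dot(p_c, y_c)
det = s_tt * s_pp - s_tp ** 2
adjusted = (s_pp * s_ty - s_tp * s_py) / det
b_pre = (s_tt * s_py - s_tp * s_ty) / det
rss_adj = sum((y - adjusted * t - b_pre * p) ** 2 for t, p, y in zip(t_c, p_c, y_c))
r2_adj = 1 - rss_adj / tss

print(f"banner only:      effect = {naive:.3f}, R^2 = {r2_naive:.3f}")
print(f"banner + pre:     effect = {adjusted:.3f}, R^2 = {r2_adj:.3f}")
print(f"true effect used: {TRUE_EFFECT}")
```

In this setup the covariate does two jobs at once: it soaks up most of the outcome variance (higher R-squared) and it removes the part of the naive estimate that comes from the treatment group starting out with longer sessions. In practice a library such as statsmodels would normally be used for the fit; the hand-written solver here only keeps the sketch dependency-free.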
The Mathematical Truth: ATE, ATT, and SB
To express this formally, the naive difference between group means mixes the average treatment effect (ATE) with the selection bias (SB):
Naive estimate = ATE + SB
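In standard potential-outcomes notation (an assumption here, since the text does not fix a notation), let Y_1 and Y_0 be a user's session duration with and without the banner, and T the indicator for seeing the banner. The three quantities named in the heading can then be written as follows, together with the decomposition of the naive difference:

```latex
\begin{aligned}
\text{ATE} &= E[Y_1 - Y_0] \\
\text{ATT} &= E[Y_1 - Y_0 \mid T = 1] \\
\text{SB}  &= E[Y_0 \mid T = 1] - E[Y_0 \mid T = 0] \\[4pt]
E[Y \mid T = 1] - E[Y \mid T = 0]
  &= \underbrace{E[Y_1 \mid T = 1] - E[Y_0 \mid T = 1]}_{\text{ATT}}
   + \underbrace{E[Y_0 \mid T = 1] - E[Y_0 \mid T = 0]}_{\text{SB}}
\end{aligned}
```

Strictly, the naive difference therefore equals ATT + SB. It reduces to ATE + SB when the banner effect is the same for every user, because ATT and ATE then coincide.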
With covariates, we can mitigate the bias and get closer to the true effect.
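The decomposition can also be checked on simulated potential outcomes, where both outcomes of every user are visible (which is never the case with real data). This sketch assumes a constant effect of 0.5 minutes and an assignment rule that favors users with long prior sessions; all numbers and names are hypothetical.

```python
import random
from statistics import mean

random.seed(7)

rows = []
for _ in range(10000):
    prior = random.gauss(5.0, 3.0)                    # hypothetical prior session duration
    y0 = 1.0 + 0.9 * prior + random.gauss(0.0, 1.0)   # outcome without the banner
    y1 = y0 + 0.5                                     # outcome with the banner (constant effect)
    # Selection: long-session users are more likely to end up in the treatment group.
    t = 1 if random.random() < (0.55 if prior > 5.0 else 0.45) else 0
    rows.append((t, y0, y1))

treated = [r for r in rows if r[0] == 1]
control = [r for r in rows if r[0] == 0]

# What an analyst observes: y1 for treated users, y0 for control users.
naive = mean(y1 for _, _, y1 in treated) - mean(y0 for _, y0, _ in control)

# Quantities that need both potential outcomes.
ate = mean(y1 - y0 for _, y0, y1 in rows)
att = mean(y1 - y0 for _, y0, y1 in treated)
sb = mean(y0 for _, y0, _ in treated) - mean(y0 for _, y0, _ in control)

print(f"naive = {naive:.3f}, ATE = {ate:.3f}, ATT = {att:.3f}, SB = {sb:.3f}")
print(f"ATT + SB = {att + sb:.3f}")
```

Because the effect is constant in this sketch, ATE and ATT are identical, and the naive estimate exceeds both by exactly the selection bias.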
Validation through Simulation
In a controlled experiment where the true effect is known (0.5 minutes), it shows: