Chapter 3. Conditional and Unconditional Parallel Trends
Story: Two Cities, Two Trajectories
Imagine two cities — Riverbend and Oakville. Both have similar economic profiles. One day, Riverbend raises its minimum wage while Oakville does not.
Over the next few years, you observe that Riverbend’s employment rates rise slightly faster than Oakville’s. You might think: "The minimum wage helped employment!"
But wait — even before the policy, Riverbend’s employment had been creeping up faster than Oakville’s. Without accounting for these underlying differences in trend, simply comparing "before and after" outcomes would be misleading.
This chapter explores how parallel trends (whether unconditional or conditional) shape our ability to make valid causal claims.
Concept: The Parallel Trends Assumption
At the heart of Difference-in-Differences (DiD) lies a critical assumption: Parallel Trends.
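Unconditional Parallel Trends means that, absent treatment, the average outcomes of the treated and control groups would have changed by the same amount over time, so the gap between them would have stayed constant with no adjustment needed.
Formally, in standard potential-outcomes notation, with Y_t(0) the untreated outcome in period t and D the treatment-group indicator (D = 1 treated, D = 0 control):
$$E[Y_t(0) - Y_{t-1}(0) \mid D = 1] = E[Y_t(0) - Y_{t-1}(0) \mid D = 0]$$
Conditional Parallel Trends relaxes this: trends would have been parallel after adjusting for certain observed covariates (like income or education). Formally:
$$E[Y_t(0) - Y_{t-1}(0) \mid X, D = 1] = E[Y_t(0) - Y_{t-1}(0) \mid X, D = 0]$$
where X represents conditioning covariates.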
Why Does This Matter?
Without parallel trends, the simple DiD estimate could be biased:
If Riverbend was already on a faster trajectory, we'd wrongly attribute all improvement to the policy.
If differences stemmed from pre-existing factors, not treatment, the causal claim would fail.
Thus, parallel trends are about having a valid counterfactual — imagining what would have happened to Riverbend if no policy had been enacted.
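To make the bias concrete, here is a small arithmetic sketch. All employment figures and trend values are hypothetical numbers chosen for illustration; they are not data from the chapter.

```python
# Hypothetical employment rates (percent) before and after the policy.
riverbend_pre, riverbend_post = 70.0, 73.0   # treated city
oakville_pre, oakville_post = 70.0, 71.0     # control city

# Simple DiD: change in the treated city minus change in the control city.
did = (riverbend_post - riverbend_pre) - (oakville_post - oakville_pre)

# Assumption for illustration: absent the policy, Riverbend would have grown
# by 2 points per period and Oakville by 1, so trends are NOT parallel.
riverbend_trend, oakville_trend = 2.0, 1.0

# Counterfactual Riverbend outcome without the policy, and the true effect.
riverbend_counterfactual = riverbend_pre + riverbend_trend
true_effect = riverbend_post - riverbend_counterfactual

# The DiD bias equals the gap between the two underlying trends.
bias = did - true_effect

print(f"DiD estimate: {did}, true effect: {true_effect}, bias: {bias}")
```

Under these assumed numbers the simple DiD estimate is twice the true effect, and the excess is exactly the difference in the cities' underlying trends.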
Building Causal Intuition
We note:
DiD is a causal method for observational data. It does not require randomized treatment, but it does require that the treated and control groups would have followed comparable trends absent treatment.
Selection Bias (from Chapter 2) threatens DiD unless trends are truly parallel.
Ignorability (Chapter 2) underpins conditional parallel trends: after controlling for key covariates, treatment assignment is as good as random with respect to how untreated outcomes would have evolved over time.
🔗 If you need a refresher on ignorability or selection bias, please revisit Chapter 2.
Deep Dive: LASSO and Normalized Differences (ND)
When parallel trends do not hold unconditionally, we must find covariates that restore them conditionally.
LASSO: Variable Selection
Why LASSO? We often have many potential predictors — demographics, economics, crime rates, etc. LASSO (Least Absolute Shrinkage and Selection Operator) automatically selects variables that best predict treatment.
It shrinks coefficients toward zero and sets the weakest ones exactly to zero, keeping only the most predictive covariates.
This helps limit overfitting and yields a sparse, interpretable model.
In our context: We use pre-treatment data to run LASSO and find covariates that predicted which cities were treated. These covariates are critical — they likely influenced both treatment and trends.
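A minimal sketch of this selection step, under stated assumptions: the data are synthetic, the covariate names are hypothetical, and treatment status is regressed on standardized covariates with scikit-learn's `Lasso` (a linear-probability approximation chosen here for simplicity, not necessarily the chapter's exact specification). The penalty `alpha=0.05` is an arbitrary illustrative value; in practice it would usually be tuned, for example by cross-validation.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 2000

# Hypothetical pre-treatment covariates (synthetic, for illustration only).
names = ["median_income", "pct_college", "crime_rate", "noise_1", "noise_2"]
X = rng.normal(size=(n, len(names)))

# Assumption: only the first two covariates drive treatment assignment.
logits = 1.5 * X[:, 0] + 1.0 * X[:, 1]
treated = (rng.random(n) < 1.0 / (1.0 + np.exp(-logits))).astype(float)

# Standardize so the L1 penalty treats every covariate on the same scale.
X_std = StandardScaler().fit_transform(X)

# L1-penalized regression of treatment status on the covariates.
model = Lasso(alpha=0.05).fit(X_std, treated)

# Covariates with nonzero coefficients are the ones LASSO keeps.
selected = [name for name, coef in zip(names, model.coef_) if coef != 0.0]
print("Selected covariates:", selected)
```

The selected covariates are the candidates to condition on and to check for balance.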
Normalized Difference (ND): Checking Balance
After selecting covariates, we check whether the treated and control groups differ too much.
ND formula:
$$ND = \frac{\bar{X}_{\omega,T} - \bar{X}_{\omega,C}}{\sqrt{(S^2_{\omega,T} + S^2_{\omega,C})/2}}$$
$\bar{X}_{\omega,T}$: Mean of covariate X in treated group
$\bar{X}_{\omega,C}$: Mean of covariate X in control group
$S^2_{\omega,T}$, $S^2_{\omega,C}$: Variances of covariate X in the treated and control groups
Threshold:
An absolute ND below 0.25 (|ND| < 0.25) is usually considered acceptable balance (Imbens & Rubin, 2015).
If covariates are unbalanced, naive DiD is risky.
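The balance check can be sketched in a few lines. This assumes the usual form of the normalized difference: the gap in group means divided by the square root of the average of the two group sample variances. The covariate values below are made up for illustration, and `normalized_difference` is a hypothetical helper name.

```python
import numpy as np

def normalized_difference(x_treated, x_control):
    """Difference in means scaled by the root of the average group variance."""
    x_treated = np.asarray(x_treated, dtype=float)
    x_control = np.asarray(x_control, dtype=float)
    mean_gap = x_treated.mean() - x_control.mean()
    # ddof=1 gives the sample variance of each group.
    avg_var = (x_treated.var(ddof=1) + x_control.var(ddof=1)) / 2.0
    return mean_gap / np.sqrt(avg_var)

# Hypothetical covariate values (e.g. median income in $1,000s).
income_treated = [52, 55, 58, 61]
income_control = [50, 53, 56, 59]

nd = normalized_difference(income_treated, income_control)
balanced = abs(nd) < 0.25  # threshold from Imbens & Rubin (2015)
print(f"ND = {nd:.3f}, balanced: {balanced}")
```

Here the means differ by 2 and each group has sample variance 15, so ND is about 0.52 and the covariate would be flagged as imbalanced.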
Insight: Mind the Gap (and Trend)
Returning to Riverbend and Oakville:
If Riverbend’s employment was already improving faster before the minimum wage hike, unconditional DiD fails.
Conditioning on key covariates (say, % college-educated residents, median income) might restore parallel trends.
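The following sketch shows how conditioning can restore parallel trends. It uses a tiny, noise-free synthetic dataset in which a single binary covariate (a hypothetical "high college share" flag) drives both the outcome trend and the chance of treatment. The effect size, trend values, and group sizes are all assumptions made for illustration.

```python
import numpy as np

# 10 treated units (8 with x=1) and 10 control units (2 with x=1).
x = np.array([1] * 8 + [0] * 2 + [1] * 2 + [0] * 8)   # covariate
d = np.array([1] * 10 + [0] * 10)                     # treatment indicator

effect = 1.5                                  # assumed true treatment effect
y_pre = 50.0 + 5.0 * x                        # baseline depends on x
y_post = y_pre + 1.0 + 2.0 * x + effect * d   # trend depends on x
change = y_post - y_pre

# Unconditional DiD: compare average changes, ignoring x.
unconditional = change[d == 1].mean() - change[d == 0].mean()

# Conditional DiD: compare changes within each level of x, then average
# the stratum estimates using the treated group's covariate distribution.
estimates, weights = [], []
for value in np.unique(x):
    in_stratum = x == value
    gap = (change[in_stratum & (d == 1)].mean()
           - change[in_stratum & (d == 0)].mean())
    estimates.append(gap)
    weights.append((in_stratum & (d == 1)).sum())
conditional = np.average(estimates, weights=weights)

print(f"Unconditional DiD: {unconditional:.2f}")
print(f"Conditional DiD:   {conditional:.2f}")
```

In this construction the unconditional estimate is 2.7 while the stratified estimate recovers the assumed effect of 1.5, because within each level of the covariate the two groups share the same trend.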
Takeaway: Without parallel trends (unconditional or conditional), DiD estimates are biased and misleading. Good causal inference depends on crafting a proper counterfactual.
And beyond research — this explains why you shouldn't believe every online claim: Just because someone says "I took Supplement X and got better" doesn’t prove causality. Without randomization (Chapter 2) or proper trend comparison (Chapter 3), we can't distinguish true effects from coincidences, confounders, or pre-existing trajectories.
Practice: Diagnosing and Conditioning for Parallel Trends
We will:
Use LASSO to select key covariates predicting treatment
Check Normalized Differences (ND) for balance
Plot Pre-Treatment Trends to visually assess parallelism
Step-by-Step Python Code
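A minimal sketch of the pre-treatment trend check, assuming a long-format panel with hypothetical columns `city`, `treated`, `month`, and `employment`. The panel is synthetic, with Riverbend given a steeper slope on purpose, and the treatment date of July 2019 is borrowed from the worksheet scenario. Besides the plot, the sketch compares fitted pre-period slopes as a rough numerical companion to the visual check.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
months = pd.date_range("2018-01-01", periods=24, freq="MS")

# Synthetic panel: one treated and one control city with different slopes.
rows = []
for city, treated, slope in [("Riverbend", 1, 0.30), ("Oakville", 0, 0.10)]:
    for t, month in enumerate(months):
        rows.append({
            "city": city,
            "treated": treated,
            "month": month,
            "employment": 60.0 + slope * t + rng.normal(scale=0.05),
        })
panel = pd.DataFrame(rows)

# Keep only observations before the treatment month.
treatment_month = pd.Timestamp("2019-07-01")
pre = panel[panel["month"] < treatment_month]

# Mean outcome per month for control (0) and treated (1) groups.
trends = pre.pivot_table(index="month", columns="treated",
                         values="employment", aggfunc="mean")

# Fit a straight line to each group's pre-period and compare slopes.
t = np.arange(len(trends))
slope_control = np.polyfit(t, trends[0].to_numpy(), 1)[0]
slope_treated = np.polyfit(t, trends[1].to_numpy(), 1)[0]
slope_gap = slope_treated - slope_control
print(f"Pre-treatment slope gap (treated - control): {slope_gap:.3f}")

def plot_pre_trends(trends):
    """Line plot of pre-treatment group means (requires matplotlib)."""
    ax = trends.rename(columns={0: "control", 1: "treated"}).plot(marker="o")
    ax.set_xlabel("month")
    ax.set_ylabel("mean employment")
    ax.set_title("Pre-treatment trends")
    return ax

# plot_pre_trends(trends)  # uncomment to draw the figure
```

A slope gap near zero is consistent with parallel pre-trends; a clearly nonzero gap, as in this synthetic example, suggests unconditional DiD would be biased.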
Task: Worksheet — Checking Conditional Parallel Trends
🔎 You are the researcher.
Scenario:
Set the treated city to "Santa Barbara".
Set the treatment month to "2019-07".
Instructions:
Load chapter1log.csv again.
Filter data before July 2019.
Run LASSO to select predictive covariates.
Compute Normalized Differences (ND) for those covariates.
Plot pre-treatment outcome trends.
Questions:
Which covariates had the largest imbalance?
Were the trends visually parallel before July 2019?
Would you trust an unconditional DiD estimate in this case? Why or why not?
What You Learned
How DiD relies on parallel trends
Why unconditional trends might fail
How to use LASSO and ND to diagnose imbalance
How to visually inspect trends before treatment