A Comprehensive Tutorial to Propensity Score Matching in R
Author
Zahid Asghar
Published
June 14, 2024
Introduction
Causal inference is a critical aspect of research in fields such as economics, public health, and social sciences. One of the primary challenges in causal inference is estimating the effect of a treatment or intervention on an outcome variable while accounting for potential confounding variables. Conducting Randomized Controlled Trials (RCTs) is considered the gold standard for estimating causal effects. However, in many cases, researchers rely on observational data, where treatment assignment is not random, leading to potential bias in the estimated treatment effect. Propensity score matching is a widely used method to address this issue by balancing covariates between treatment and control groups, thereby reducing bias and improving the validity of causal inference. Propensity score matching (PSM) is a statistical technique designed to address this challenge by accounting for covariates that predict receiving the treatment, making it particularly useful in observational studies where random assignment is not feasible. In this tutorial, I will demonstrate how to perform propensity score matching using various methods in R, highlight the advantages of entropic balancing, and provide a step-by-step guide to assessing balance diagnostics and estimating treatment effects with matched data. This comprehensive guide aims to equip researchers with the necessary tools to conduct robust causal inference using observational data.
I will demonstrate how to perform propensity score matching in R using the MatchIt package and others. Additionally, we will highlight entropic balancing, a useful tool that matches data without losing sample observations. The goal is to minimize the entropy of the covariates in the treated and control groups, thereby creating balance between the groups.
I will also demonstrate how to use the ebal package to perform entropy balancing in R, assess balance diagnostics, and estimate the treatment effect using the matched data. Various matching methods, including nearest neighbor matching, kernel matching, radius matching, propensity score matching, caliper matching, exact matching and entropy balancing will be covered. Additionally, I will demonstrate how to use the cobalt package to assess balance diagnostics and the WeightIt package to perform weighting to estimate the treatment effect.
Load packages
First, we need to load the required R packages for this tutorial. If you haven’t installed these packages yet, you can do so by running the following code:
Code
# Install pacman if not already installedif (!requireNamespace("pacman", quietly =TRUE)) install.packages("pacman")# Use pacman to install and load the required packagespacman::p_load( MatchIt, cobalt, WeightIt, survey, haven, lmtest, tidyverse, patchwork, Matching, ebal)
Load data
We will work with the sample dataset lalonde from the MatchIt package, which contains a treatment group from the National Supported Work Demonstration (NSW) program from the mid-1970s designed to help disadvantaged individuals find employment and improve their economic situation. The data includes information on individuals who participated in the program (treat = 1) and those who did not (treat = 0). We will use propensity score matching to estimate the effect of the program on the earnings of the participants.
The key question we want to answer is:
Did participation in the NSW demonstration program result in higher incomes?
Code
data(lalonde, package ="MatchIt") lalonde$age2 <- lalonde$age^2# Create a new variable age2lalonde |>slice(0) |>glimpse()
treat: A binary variable indicating whether the individual participated in the program (1) or not (0).
age: The age of the individual(in years)
educ: The number of years of education.
race: A factor variable indicating black, hispanic or white.
married: A binary variable indicating whether the individual is married (1) or not (0).
nodegree: A binary variable indicating whether the individual has a degree (1) or not (0).
re74: Earnings in 1974.
re75: Earnings in 1975.
re78: Earnings in 1978.
We have total observations 614 and 10 variables in the dataset. After insepection of data and understanding the variables, we can proceed further for analyzing impact of treatment on earnings in 1978.
t-Test
We can start by performing a simple t-test to compare the earnings in 1978 between the treatment and control groups. There are 185 individuals in the treatment group and 429 individuals in the control group.
Code
t.test(re78 ~ treat, data = lalonde)
Welch Two Sample t-test
data: re78 by treat
t = 0.93773, df = 326.41, p-value = 0.3491
alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
95 percent confidence interval:
-697.192 1967.244
sample estimates:
mean in group 0 mean in group 1
6984.170 6349.144
Results indicate that 6349.14 is the average earnings in 1978 for the treatment group and 6984.17 is the average earnings in 1978 for the control group. Difference in means is -635.03 is a bad sign that NSW program is not effective. But this data set is built from two different data sets NSW and PSID and hence we need to perform propensity score matching to get the correct estimate of treatment effect.
Matching methods
There are various methods to perform propensity score matching. We will start with exact matching and then move to propensity score matching using different methods.
Exact Matching
Exact matching is a method of matching treatment and control units based on exact values of covariates. In this method, treatment and control units are matched if they have the same values on the covariates. This method is useful when there are a small number of covariates and the sample size is large enough to allow for exact matching.
Code
# Perform exact matchingexact_match <-matchit(treat ~ age +age2 + educ + race + married + nodegree,exact =~ age +age2 + educ + race + married + nodegree,data = lalonde)
Warning: Fewer control units than treated units in some `exact` strata; not all
treated units will get a match.
Code
exact_match
A matchit object
- method: 1:1 nearest neighbor matching without replacement
- distance: Propensity score
- estimated with logistic regression
- number of obs.: 614 (original), 90 (matched)
- target estimand: ATT
- covariates: age, age2, educ, race, married, nodegree
Summary of exact_match is given as follows:
Code
# Summary of the matchingsummary(exact_match)
Call:
matchit(formula = treat ~ age + age2 + educ + race + married +
nodegree, data = lalonde, exact = ~age + age2 + educ + race +
married + nodegree)
Summary of Balance for All Data:
Means Treated Means Control Std. Mean Diff. Var. Ratio eCDF Mean
distance 0.6204 0.1637 1.7929 1.3755 0.3992
age 25.8162 28.0303 -0.3094 0.4400 0.0813
age2 717.3946 901.7786 -0.4276 0.3627 0.0813
educ 10.3459 10.2354 0.0550 0.4959 0.0347
raceblack 0.8432 0.2028 1.7615 . 0.6404
racehispan 0.0595 0.1422 -0.3498 . 0.0827
racewhite 0.0973 0.6550 -1.8819 . 0.5577
married 0.1892 0.5128 -0.8263 . 0.3236
nodegree 0.7081 0.5967 0.2450 . 0.1114
eCDF Max
distance 0.6752
age 0.1577
age2 0.1577
educ 0.1114
raceblack 0.6404
racehispan 0.0827
racewhite 0.5577
married 0.3236
nodegree 0.1114
Summary of Balance for Matched Data:
Means Treated Means Control Std. Mean Diff. Var. Ratio eCDF Mean
distance 0.4837 0.4837 0 1 0
age 20.2444 20.2444 0 1 0
age2 423.5778 423.5778 0 1 0
educ 10.4000 10.4000 0 1 0
raceblack 0.7333 0.7333 0 . 0
racehispan 0.0444 0.0444 0 . 0
racewhite 0.2222 0.2222 0 . 0
married 0.0444 0.0444 0 . 0
nodegree 0.7111 0.7111 0 . 0
eCDF Max Std. Pair Dist.
distance 0 0
age 0 0
age2 0 0
educ 0 0
raceblack 0 0
racehispan 0 0
racewhite 0 0
married 0 0
nodegree 0 0
Sample Sizes:
Control Treated
All 429 185
Matched 45 45
Unmatched 384 140
Discarded 0 0
From this summary we can see that there are matched pairs in the data. We can also see the distribution of the covariates in the matched data. As matched data contain only 45 observations from each treatment and control group, so we lose lot of information and hence we can’t rely on this method. Therefore, we will use other matching methods to estimate the treatment effect, called propensity score matching.
Exact matching diagnostics
love.plot function from cobalt package is used to visualize the diagnostics of the matching. The function plots the standardized mean differences before and after matching for each covariate. The goal is to achieve balance in the covariates between the treatment and control groups after matching.
Code
love.plot(exact_match, drop.distance =FALSE)
Warning: Standardized mean differences and raw mean differences are present in the same plot.
Use the `stars` argument to distinguish between them and appropriately label the x-axis.
Red dots represent the standardized mean differences before matching, and blue dots represent the standardized mean differences after matching. Figure 1 indicates that all adjusted difference are exactly equal to zero. This is because we have performed exact matching, which means that the treatment and control units are matched exactly on the covariates.
Propensity Score Matching
Is a white race person with 12 years of education, 39 years of age and married not same as another person with 12 years of education, 40 years of age and married?
Here, comes issue of PSM as exact match is not there but these two persons seem to be similar. Propensity score matching is a method used to estimate the average treatment effect by matching treatment and control units based on their propensity scores. The propensity score is the probability of receiving the treatment given the observed covariates.
A matchit object
- method: 1:1 nearest neighbor matching without replacement
- distance: Propensity score
- estimated with logistic regression
- number of obs.: 614 (original), 370 (matched)
- target estimand: ATT
- covariates: age, age2, educ, race, married, nodegree
So now instead of 45 matches, we have 185 matches from each group based on 5 covariates. We can see the summary of the matching as follows:
Code
summary(ps_match)
Call:
matchit(formula = treat ~ age + age2 + educ + race + married +
nodegree, data = lalonde)
Summary of Balance for All Data:
Means Treated Means Control Std. Mean Diff. Var. Ratio eCDF Mean
distance 0.6204 0.1637 1.7929 1.3755 0.3992
age 25.8162 28.0303 -0.3094 0.4400 0.0813
age2 717.3946 901.7786 -0.4276 0.3627 0.0813
educ 10.3459 10.2354 0.0550 0.4959 0.0347
raceblack 0.8432 0.2028 1.7615 . 0.6404
racehispan 0.0595 0.1422 -0.3498 . 0.0827
racewhite 0.0973 0.6550 -1.8819 . 0.5577
married 0.1892 0.5128 -0.8263 . 0.3236
nodegree 0.7081 0.5967 0.2450 . 0.1114
eCDF Max
distance 0.6752
age 0.1577
age2 0.1577
educ 0.1114
raceblack 0.6404
racehispan 0.0827
racewhite 0.5577
married 0.3236
nodegree 0.1114
Summary of Balance for Matched Data:
Means Treated Means Control Std. Mean Diff. Var. Ratio eCDF Mean
distance 0.6204 0.3300 1.1401 1.0782 0.1687
age 25.8162 26.4216 -0.0846 0.7654 0.0381
age2 717.3946 764.6270 -0.1095 0.7823 0.0381
educ 10.3459 10.3135 0.0161 0.5352 0.0284
raceblack 0.8432 0.4486 1.0853 . 0.3946
racehispan 0.0595 0.2054 -0.6172 . 0.1459
racewhite 0.0973 0.3459 -0.8390 . 0.2486
married 0.1892 0.3243 -0.3450 . 0.1351
nodegree 0.7081 0.6486 0.1308 . 0.0595
eCDF Max Std. Pair Dist.
distance 0.4811 1.1401
age 0.1135 1.2375
age2 0.1135 1.1531
educ 0.0649 1.1990
raceblack 0.3946 1.0853
racehispan 0.1459 0.9829
racewhite 0.2486 0.9120
married 0.1351 0.8419
nodegree 0.0595 0.8679
Sample Sizes:
Control Treated
All 429 185
Matched 185 185
Unmatched 244 0
Discarded 0 0
Propensity score matching diagnostics
Code
love.plot(ps_match, drop.distance =FALSE)
Warning: Standardized mean differences and raw mean differences are present in the same plot.
Use the `stars` argument to distinguish between them and appropriately label the x-axis.
Figure 2 indicates that the standardized mean differences are close to zero for covariates other than race after propensity score matching, which indicates that the treatment and control groups are well balanced on the covariates.
Nearest neighbor matching with replacement
The method argument is set to “nearest” to perform nearest neighbor matching. By default, the matchit function performs nearest neighbor matching with replacement, which means that a control unit can be matched to multiple treatment units.
Code
# Perform propensity score matching with nearest neighbor matching without replacementmatch_model <-matchit(treat ~ age + age2 + educ+ race + married + re74 + re75, data = lalonde, method ="nearest", replace =TRUE)# Summary of the matchingsummary(match_model)
Call:
matchit(formula = treat ~ age + age2 + educ + race + married +
re74 + re75, data = lalonde, method = "nearest", replace = TRUE)
Summary of Balance for All Data:
Means Treated Means Control Std. Mean Diff. Var. Ratio eCDF Mean
distance 0.6203 0.1638 1.8245 1.2861 0.3988
age 25.8162 28.0303 -0.3094 0.4400 0.0813
age2 717.3946 901.7786 -0.4276 0.3627 0.0813
educ 10.3459 10.2354 0.0550 0.4959 0.0347
raceblack 0.8432 0.2028 1.7615 . 0.6404
racehispan 0.0595 0.1422 -0.3498 . 0.0827
racewhite 0.0973 0.6550 -1.8819 . 0.5577
married 0.1892 0.5128 -0.8263 . 0.3236
re74 2095.5737 5619.2365 -0.7211 0.5181 0.2248
re75 1532.0553 2466.4844 -0.2903 0.9563 0.1342
eCDF Max
distance 0.6760
age 0.1577
age2 0.1577
educ 0.1114
raceblack 0.6404
racehispan 0.0827
racewhite 0.5577
married 0.3236
re74 0.4470
re75 0.2876
Summary of Balance for Matched Data:
Means Treated Means Control Std. Mean Diff. Var. Ratio eCDF Mean
distance 0.6203 0.6197 0.0023 0.9859 0.0028
age 25.8162 25.4649 0.0491 0.9766 0.0391
age2 717.3946 699.6486 0.0411 1.1716 0.0391
educ 10.3459 10.1081 0.1183 0.4193 0.0381
raceblack 0.8432 0.8270 0.0446 . 0.0162
racehispan 0.0595 0.0811 -0.0914 . 0.0216
racewhite 0.0973 0.0919 0.0182 . 0.0054
married 0.1892 0.1568 0.0828 . 0.0324
re74 2095.5737 1742.7413 0.0722 1.7125 0.0396
re75 1532.0553 1442.0516 0.0280 1.5596 0.0278
eCDF Max Std. Pair Dist.
distance 0.0486 0.0146
age 0.1135 0.8167
age2 0.1135 0.7885
educ 0.1622 1.1936
raceblack 0.0162 0.1933
racehispan 0.0216 0.4572
racewhite 0.0054 0.2371
married 0.0324 0.6625
re74 0.2162 0.5364
re75 0.1297 0.6045
Sample Sizes:
Control Treated
All 429. 185
Matched (ESS) 42.52 185
Matched 75. 185
Unmatched 354. 0
Discarded 0. 0
Warning: Standardized mean differences and raw mean differences are present in the same plot.
Use the `stars` argument to distinguish between them and appropriately label the x-axis.
Nearest neighbor matching without replacement
To perform nearest neighbor matching without replacement, we can set the replace argument to FALSE. This ensures that each control unit is matched to at most one treatment unit.
Code
# Perform propensity score matchingmatch_model_wor <-matchit(treat ~ age + age2 + educ + race + married + re74 + re75, data = lalonde, method ="nearest", replace =FALSE)# Summary of the matchingsummary(match_model_wor)
Call:
matchit(formula = treat ~ age + age2 + educ + race + married +
re74 + re75, data = lalonde, method = "nearest", replace = FALSE)
Summary of Balance for All Data:
Means Treated Means Control Std. Mean Diff. Var. Ratio eCDF Mean
distance 0.6203 0.1638 1.8245 1.2861 0.3988
age 25.8162 28.0303 -0.3094 0.4400 0.0813
age2 717.3946 901.7786 -0.4276 0.3627 0.0813
educ 10.3459 10.2354 0.0550 0.4959 0.0347
raceblack 0.8432 0.2028 1.7615 . 0.6404
racehispan 0.0595 0.1422 -0.3498 . 0.0827
racewhite 0.0973 0.6550 -1.8819 . 0.5577
married 0.1892 0.5128 -0.8263 . 0.3236
re74 2095.5737 5619.2365 -0.7211 0.5181 0.2248
re75 1532.0553 2466.4844 -0.2903 0.9563 0.1342
eCDF Max
distance 0.6760
age 0.1577
age2 0.1577
educ 0.1114
raceblack 0.6404
racehispan 0.0827
racewhite 0.5577
married 0.3236
re74 0.4470
re75 0.2876
Summary of Balance for Matched Data:
Means Treated Means Control Std. Mean Diff. Var. Ratio eCDF Mean
distance 0.6203 0.3337 1.1452 1.0188 0.1608
age 25.8162 25.7946 0.0030 0.8198 0.0305
age2 717.3946 727.4703 -0.0234 0.8493 0.0305
educ 10.3459 10.2270 0.0591 0.4109 0.0461
raceblack 0.8432 0.4432 1.1002 . 0.4000
racehispan 0.0595 0.2216 -0.6857 . 0.1622
racewhite 0.0973 0.3351 -0.8025 . 0.2378
married 0.1892 0.3135 -0.3174 . 0.1243
re74 2095.5737 3038.1026 -0.1929 1.1840 0.1010
re75 1532.0553 2026.3935 -0.1536 1.1539 0.0826
eCDF Max Std. Pair Dist.
distance 0.4703 1.1452
age 0.1027 1.1362
age2 0.1027 1.0324
educ 0.1243 1.3603
raceblack 0.4000 1.1002
racehispan 0.1622 1.0514
racewhite 0.2378 0.9484
married 0.1243 0.9523
re74 0.3568 0.7503
re75 0.2000 0.8552
Sample Sizes:
Control Treated
All 429 185
Matched 185 185
Unmatched 244 0
Discarded 0 0
Warning: Standardized mean differences and raw mean differences are present in the same plot.
Use the `stars` argument to distinguish between them and appropriately label the x-axis.
This Figure 4 plot shows the standardized mean differences for each covariate before and after matching. The standardized mean differences are almost zero other than race and re74 variables which are little away (blue dots), indicating that the treatment and control groups are well balanced on the covariates.
Matching with caliper
The caliper argument specifies the maximum distance within which a control unit can be matched to a treatment unit. This is useful for ensuring that the matched units are similar to each other.
Code
match_caliper <-matchit(treat ~ age + age2 + educ + race + married + re74 + re75, data = lalonde, method ="nearest", caliper =0.01)bal.tab(match_caliper)
Balance Measures
Type Diff.Adj
distance Distance 0.0016
age Contin. 0.0945
age2 Contin. 0.0966
educ Contin. 0.2048
race_black Binary 0.0294
race_hispan Binary 0.0000
race_white Binary -0.0294
married Binary 0.0000
re74 Contin. 0.1912
re75 Contin. -0.0282
Sample sizes
Control Treated
All 429 185
Matched 68 68
Unmatched 361 117
Kernel matching
Code
library(Matching)match_kernel <-matchit(treat ~ age + age2 + educ + race+ married + re74 + re75, data = lalonde, method ="optimal")bal.tab(match_kernel)
Balance Measures
Type Diff.Adj
distance Distance 1.1452
age Contin. -0.0121
age2 Contin. -0.0403
educ Contin. 0.0511
race_black Binary 0.4000
race_hispan Binary -0.1622
race_white Binary -0.2378
married Binary -0.1243
re74 Contin. -0.1778
re75 Contin. -0.1397
Sample sizes
Control Treated
All 429 185
Matched 185 185
Unmatched 244 0
Radius matching
Code
match_radius <-matchit(treat ~ age + age2 + educ +race + married + re74 + re75, data = lalonde, method ="nearest", caliper =0.01)bal.tab(match_radius)
Balance Measures
Type Diff.Adj
distance Distance 0.0016
age Contin. 0.0945
age2 Contin. 0.0966
educ Contin. 0.2048
race_black Binary 0.0294
race_hispan Binary 0.0000
race_white Binary -0.0294
married Binary 0.0000
re74 Contin. 0.1912
re75 Contin. -0.0282
Sample sizes
Control Treated
All 429 185
Matched 68 68
Unmatched 361 117
Mahalanobis matching
Code
match_mahal <-matchit(treat ~ age + age2 + educ + race+ married + re74 + re75, data = lalonde, method ="nearest", distance ="mahalanobis")bal.tab(match_mahal)
Balance Measures
Type Diff.Adj
age Contin. 0.0914
age2 Contin. 0.0269
educ Contin. -0.2285
race_black Binary 0.3892
race_hispan Binary 0.0000
race_white Binary -0.3892
married Binary -0.0595
re74 Contin. -0.3539
re75 Contin. -0.2009
Sample sizes
Control Treated
All 429 185
Matched 185 185
Unmatched 244 0
Code
# Outcome analysis on the matched dataoutcome_model <-lm(re78 ~ treat, data = matched_data)summary(outcome_model)
Call:
lm(formula = re78 ~ treat, data = matched_data)
Residuals:
Min 1Q Median 3Q Max
-6349 -5644 -2185 3317 53959
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5644.0 866.4 6.514 3.81e-10 ***
treat 705.1 1027.2 0.686 0.493
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 7504 on 258 degrees of freedom
Multiple R-squared: 0.001823, Adjusted R-squared: -0.002046
F-statistic: 0.4712 on 1 and 258 DF, p-value: 0.493
Propensity Score Matching with Logistic Regression
Fit a logistic regression model to estimate propensity scores by including predictors: age, age squared, education, race (black and hispanic), marital status, and prior earnings (re74, re75)
Code
ps_model <-glm(treat ~ age + age2 + educ+race + married + re74 + re75, data = lalonde, family = binomial)# Calculate the propensity scores using the fitted modellalonde$pscore <-predict(ps_model, type ="response")# Perform propensity score matching with nearest neighbor matching without replacementmatch_model_logit <-matchit(treat ~ pscore, data = lalonde, method ="nearest", replace =FALSE)# Print the summary of the matchingsummary(match_model_logit)
Call:
matchit(formula = treat ~ pscore, data = lalonde, method = "nearest",
replace = FALSE)
Summary of Balance for All Data:
Means Treated Means Control Std. Mean Diff. Var. Ratio eCDF Mean
distance 0.6181 0.1647 1.6832 1.5921 0.3988
pscore 0.6203 0.1638 1.8245 1.2861 0.3988
eCDF Max
distance 0.676
pscore 0.676
Summary of Balance for Matched Data:
Means Treated Means Control Std. Mean Diff. Var. Ratio eCDF Mean
distance 0.6181 0.3076 1.1528 1.0388 0.1608
pscore 0.6203 0.3337 1.1452 1.0188 0.1608
eCDF Max Std. Pair Dist.
distance 0.4703 1.1528
pscore 0.4703 1.1452
Sample Sizes:
Control Treated
All 429 185
Matched 185 185
Unmatched 244 0
Discarded 0 0
Perform balance diagnostics to check if the covariates are balanced after matching
Code
balance_logit <-bal.tab(match_model_logit, un =TRUE)# Print the balance diagnosticsprint(balance_logit)
Balance Measures
Type Diff.Un Diff.Adj
distance Distance 1.6832 1.1528
pscore Contin. 1.8245 1.1452
Sample sizes
Control Treated
All 429 185
Matched 185 185
Unmatched 244 0
This indicates that a difference of 1.15 is adjusted for the covariates.
Conduct outcome analysis on the matched data using a linear model
Code
outcome_model <-lm(re78 ~ treat, data = matched_data_logit)summary(outcome_model)
Call:
lm(formula = re78 ~ treat, data = matched_data_logit)
Residuals:
Min 1Q Median 3Q Max
-6349 -5656 -2093 3263 53959
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5656.0 524.1 10.792 <2e-16 ***
treat 693.1 741.2 0.935 0.35
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 7129 on 368 degrees of freedom
Multiple R-squared: 0.002371, Adjusted R-squared: -0.0003402
F-statistic: 0.8745 on 1 and 368 DF, p-value: 0.3503
Entropic balancing
Entropy balancing is a method used in causal inference and observational studies to achieve balance between treated and control groups by reweighting the data. It ensures that the covariate distributions in the treated and control groups are as similar as possible, thus reducing bias when estimating causal effects.
What is Entropic Balancing?
Entropic balancing is based on the principle of entropy maximization. The idea is to find weights for each observation in the dataset such that the weighted empirical distributions of the covariates in the treated and control groups are as similar as possible. This is done by solving an optimization problem where the objective is to maximize entropy, subject to constraints that ensure balance on the covariates.
The process involves:
Defining a set of covariates that need to be balanced.
Specifying moment conditions that the weighted covariates should satisfy (e.g., mean, variance).
Solving a convex optimization problem to find the weights that maximize the entropy while satisfying the moment conditions.
Code
#Balancing covariates between treatment groups (binary)(ebal <-weightit( treat ~ age + age2+ educ + race+ married + re74+ re75, data = lalonde, method ="ebal", estimand ="ATT"))
A weightit object
- method: "ebal" (entropy balancing)
- number of obs.: 614
- sampling weights: none
- treatment: 2-category
- estimand: ATT (focal: 1)
- covariates: age, age2, educ, race, married, re74, re75
Code
summary(ebal)
Summary of weights
- Weight ranges:
Min Max
treated 1.0000 || 1.0000
control 0.0002 |---------------------------| 18.4549
- Units with the 5 most extreme weights by group:
5 4 3 2 1
treated 1 1 1 1 1
584 374 608 601 541
control 16.4637 16.8175 17.93 18.3239 18.4549
- Weight statistics:
Coef of Var MAD Entropy # Zeros
treated 0.000 0.000 0.000 0
control 2.488 1.339 1.374 0
- Effective Sample Sizes:
Control Treated
Unweighted 429. 185
Weighted 59.77 185
# Add weights to the datasetlalonde$weights <- ebal$weights# Create a survey design objectdesign <-svydesign(ids =~1, weights =~weights, data = lalonde)# Weighted regression analysisoutcome_model_ebal <-svyglm(re78 ~ treat, design = design)# Summary of the modelsummary(outcome_model_ebal)
Call:
svyglm(formula = re78 ~ treat, design = design)
Survey design:
svydesign(ids = ~1, weights = ~weights, data = lalonde)
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5039.0 679.9 7.411 4.18e-13 ***
treat 1310.1 892.0 1.469 0.142
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for gaussian family taken to be 43389364)
Number of Fisher Scoring iterations: 2
From these results one can see that entropy balancing has been successful in balancing the covariates between the treatment and control groups. The standardized mean differences are exactly zero for all covariates, indicating that the treatment and control groups are well balanced on the covariates.
One can also see it graphically through distribution plots of the covariates before and after weighting.
Covariate plots from Entropic balancing
We can also visualize the distribution of covariates before and after matching using the bal.plot function from the cobalt package. This function creates a plot that shows the distribution of covariates before and after matching for the treatment and control groups.
Code
# Function to create covariate distribution plots with custom colorscreate_bal_plot <-function(ebal_obj, var_name) {bal.plot(ebal_obj, var.name = var_name, which ="both",sample.names =c("Unmatched", "Matched")) +scale_fill_manual(values =c("#17becf", "#8c564b"), labels =c("Control", "Treatment")) +labs(title =paste(var_name, "Distribution"), x = var_name) +theme_minimal() +theme(axis.text.x =element_text(angle =45, hjust =1, size =12),plot.title =element_text(size =14))}
Code
# Creating the plots with the specific functionplot.age <-create_bal_plot(ebal, "age")
Scale for fill is already present.
Adding another scale for fill, which will replace the existing scale.
Code
plot.educ <-create_bal_plot(ebal, "educ")
Scale for fill is already present.
Adding another scale for fill, which will replace the existing scale.
Code
plot.race <-create_bal_plot(ebal, "race")
Scale for fill is already present.
Adding another scale for fill, which will replace the existing scale.
Code
plot.married <-create_bal_plot(ebal, "married")
Scale for fill is already present.
Adding another scale for fill, which will replace the existing scale.
distribution_graphs +ggtitle("Distribution Balance in Unmatched and Matched Samples") +theme(plot.title =element_text(size =16, hjust =0.5))
Figure 5 indicates the distribution of covariates before and after weighting. The plots show that the covariates are well balanced between the treatment and control groups after weighting. Entropic balancing has successfully reduced the differences in covariate distributions between the two groups without having reduction in sample size.
Conclusion
In this tutorial, we explored various methods for propensity score matching (PSM) to estimate the causal effect of a treatment on an outcome variable using observational data. We began with an introduction to the importance of causal inference in research and the challenges of accounting for confounding variables. Propensity score matching was highlighted as a key method for addressing these challenges, closely approximating Randomized Control Trials (RCTs).
We demonstrated how to load and inspect the lalonde dataset, which includes information on individuals who participated in the National Supported Work Demonstration (NSW) program. Using this dataset, we performed a t-test to compare earnings between the treatment and control groups, revealing the need for more sophisticated matching techniques to obtain accurate treatment effect estimates.
We covered several matching methods:
Exact Matching: Matching treatment and control units based on exact covariate values, which provided perfect balance but resulted in a small matched sample size.
Propensity Score Matching: Estimating the propensity score and matching units based on these scores, significantly improving balance while retaining more observations.
Nearest Neighbor Matching: Performing nearest neighbor matching with and without replacement to achieve balance, showing how different configurations affect the results.
Caliper, Kernel, Radius, and Mahalanobis Matching: Exploring these methods to further refine the matching process and balance covariates between the treatment and control groups.
Additionally, we introduced Entropic Balancing, a powerful technique that reweights the data to balance covariate distributions between treated and control groups by maximizing entropy. This method achieved excellent balance without losing sample size.
Through balance diagnostics and outcome analysis, we demonstrated the effectiveness of these methods in achieving covariate balance and estimating the treatment effect. The diagnostics plots and statistical summaries provided clear evidence of improved balance and more reliable estimates of the treatment effect.
In conclusion, propensity score matching and entropic balancing are essential tools for causal inference in observational studies. By carefully selecting and applying these methods, researchers can obtain more accurate and credible estimates of treatment effects, even in the absence of random assignment. This tutorial provided a comprehensive overview of these techniques, illustrating their implementation in R and their impact on the analysis of observational data.
Source Code
---title: "Propensity Score Matching Methods for causal inference"subtitle: "A Comprehensive Tutorial to Propensity Score Matching in R"author: "Zahid Asghar"date: "today"format: html: toc: true toc-float: true toc_depth: 3 code-fold: true code-tools: true number_sections: true theme: solarized-light highlight: tango highlight-style: draculaexecute: error: true freeze: auto warnings: false message: false echo: true---# IntroductionCausal inference is a critical aspect of research in fields such as economics, public health, and social sciences. One of the primary challenges in causal inference is estimating the effect of a treatment or intervention on an outcome variable while accounting for potential confounding variables. Conducting Randomized Controlled Trials (RCTs) is considered the gold standard for estimating causal effects. However, in many cases, researchers rely on observational data, where treatment assignment is not random, leading to potential bias in the estimated treatment effect. Propensity score matching is a widely used method to address this issue by balancing covariates between treatment and control groups, thereby reducing bias and improving the validity of causal inference.Propensity score matching (PSM) is a statistical technique designed to address this challenge by accounting for covariates that predict receiving the treatment, making it particularly useful in observational studies where random assignment is not feasible. In this tutorial, I will demonstrate how to perform propensity score matching using various methods in R, highlight the advantages of entropic balancing, and provide a step-by-step guide to assessing balance diagnostics and estimating treatment effects with matched data. This comprehensive guide aims to equip researchers with the necessary tools to conduct robust causal inference using observational data.I will demonstrate how to perform propensity score matching in R using the `MatchIt` package and others. Additionally, we will highlight `entropic balancing`, a useful tool that matches data without losing sample observations. The goal is to minimize the entropy of the covariates in the treated and control groups, thereby creating balance between the groups.I will also demonstrate how to use the `ebal` package to perform entropy balancing in R, assess balance diagnostics, and estimate the treatment effect using the matched data. Various matching methods, including `nearest neighbor matching`, `kernel matching`, `radius matching`, `propensity score matching`, `caliper matching`, `exact matching` and `entropy balancing` will be covered. Additionally, I will demonstrate how to use the `cobalt` package to assess balance diagnostics and the `WeightIt` package to perform weighting to estimate the treatment effect. ## Load packagesFirst, we need to load the required R packages for this tutorial. If you haven't installed these packages yet, you can do so by running the following code:```{r load-packages, message=FALSE, warning=FALSE}# Install pacman if not already installedif (!requireNamespace("pacman", quietly = TRUE)) install.packages("pacman")# Use pacman to install and load the required packagespacman::p_load( MatchIt, cobalt, WeightIt, survey, haven, lmtest, tidyverse, patchwork, Matching, ebal)```## Load dataWe will work with the sample dataset `lalonde` from the `MatchIt` package, which contains a treatment group from the National Supported Work Demonstration (NSW) program from the mid-1970s designed to help disadvantaged individuals find employment and improve their economic situation. The data includes information on individuals who participated in the program (`treat = 1`) and those who did not (`treat = 0`). We will use propensity score matching to estimate the effect of the program on the earnings of the participants. The key question we want to answer is:> Did participation in the NSW demonstration program result in higher incomes? ```{r load-data}data(lalonde, package = "MatchIt") lalonde$age2 <- lalonde$age^2 # Create a new variable age2lalonde |> slice(0) |> glimpse()```This data set contains the following variables:- `treat`: A binary variable indicating whether the individual participated in the program (1) or not (0).- `age`: The age of the individual(in years)- `educ`: The number of years of education.- `race`: A factor variable indicating black, hispanic or white.- `married`: A binary variable indicating whether the individual is married (1) or not (0).- `nodegree`: A binary variable indicating whether the individual has a degree (1) or not (0).- `re74`: Earnings in 1974.- `re75`: Earnings in 1975.- `re78`: Earnings in 1978.We have total observations `r nrow(lalonde)` and `r ncol(lalonde)` variables in the dataset.After insepection of data and understanding the variables, we can proceed further for analyzing impact of treatment on earnings in 1978.## t-TestWe can start by performing a simple t-test to compare the earnings in 1978 between the treatment and control groups. There are `r sum(lalonde$treat == 1)` individuals in the treatment group and `r sum(lalonde$treat == 0)` individuals in the control group.```{r}t.test(re78 ~ treat, data = lalonde) ```Results indicate that `r round(mean(lalonde$re78[lalonde$treat == 1]), 2)` is the average earnings in 1978 for the treatment group and `r round(mean(lalonde$re78[lalonde$treat == 0]), 2)` is the average earnings in 1978 for the control group. Difference in means is `r round(mean(lalonde$re78[lalonde$treat == 1]) - mean(lalonde$re78[lalonde$treat == 0]), 2)` is a bad sign that `NSW` program is not effective. But this data set is built from two different data sets `NSW` and `PSID` and hence we need to perform propensity score matching to get the correct estimate of treatment effect.# Matching methods There are various methods to perform propensity score matching. We will start with exact matching and then move to propensity score matching using different methods.## Exact MatchingExact matching is a method of matching treatment and control units based on exact values of covariates. In this method, treatment and control units are matched if they have the same values on the covariates. This method is useful when there are a small number of covariates and the sample size is large enough to allow for exact matching. ```{r}# Perform exact matchingexact_match <-matchit(treat ~ age +age2 + educ + race + married + nodegree,exact =~ age +age2 + educ + race + married + nodegree,data = lalonde)exact_match```Summary of exact_match is given as follows:```{r}# Summary of the matchingsummary(exact_match)```From this summary we can see that there are `r exact_match$nn` matched pairs in the data. We can also see the distribution of the covariates in the matched data. As matched data contain only 45 observations from each treatment and control group, so we lose lot of information and hence we can't rely on this method. Therefore, we will use other matching methods to estimate the treatment effect, called propensity score matching. ### Exact matching diagnostics`love.plot` function from `cobalt` package is used to visualize the diagnostics of the matching. The function plots the standardized mean differences before and after matching for each covariate. The goal is to achieve balance in the covariates between the treatment and control groups after matching.```{r}#| label: fig-exact-matchlove.plot(exact_match, drop.distance =FALSE) ```Red dots represent the standardized mean differences before matching, and blue dots represent the standardized mean differences after matching. @fig-exact-match indicates that all adjusted difference are exactly equal to zero. This is because we have performed exact matching, which means that the treatment and control units are matched exactly on the covariates.## Propensity Score Matching> Is a white race person with 12 years of education, 39 years of age and married not same as another person with 12 years of education, 40 years of age and married? Here, comes issue of PSM as exact match is not there but these two persons seem to be similar. Propensity score matching is a method used to estimate the average treatment effect by matching treatment and control units based on their propensity scores. The propensity score is the probability of receiving the treatment given the observed covariates. ```{r}ps_match <-matchit(treat ~ age+ age2 + educ + race + married + nodegree,data = lalonde)ps_match```So now instead of 45 matches, we have 185 matches from each group based on 5 covariates. We can see the summary of the matching as follows:```{r}summary(ps_match)```### Propensity score matching diagnostics```{r}#| label: fig-ps-match#| fig.cap: Diagnostics of propensity score matchinglove.plot(ps_match, drop.distance =FALSE)```@fig-ps-match indicates that the standardized mean differences are close to zero for covariates other than race after propensity score matching, which indicates that the treatment and control groups are well balanced on the covariates. ## Nearest neighbor matching with replacementThe `method` argument is set to "nearest" to perform nearest neighbor matching. By default, the `matchit` function performs nearest neighbor matching with replacement, which means that a control unit can be matched to multiple treatment units. ```{r}# Perform propensity score matching with nearest neighbor matching without replacementmatch_model <-matchit(treat ~ age + age2 + educ+ race + married + re74 + re75, data = lalonde, method ="nearest", replace =TRUE)# Summary of the matchingsummary(match_model)# Extract matched datamatched_data <-match.data(match_model)# Balance diagnosticsbalance <-bal.tab(match_model, un =TRUE)# Print balance diagnosticsprint(balance)# Summarize balance diagnosticssummary(balance)```### Nearest neighbor matching diagnostics```{r}#| label: fig-balance-nearest#| fig.cap: Balance diagnostics for nearest neighbor matching without replacement# Plot balance diagnosticslove.plot(balance, var.order ="unadjusted", threshold =0.1)```## Nearest neighbor matching without replacementTo perform nearest neighbor matching without replacement, we can set the `replace` argument to `FALSE`. This ensures that each control unit is matched to at most one treatment unit.```{r}# Perform propensity score matchingmatch_model_wor <-matchit(treat ~ age + age2 + educ + race + married + re74 + re75, data = lalonde, method ="nearest", replace =FALSE)# Summary of the matchingsummary(match_model_wor)# Extract matched datamatched_data_wor <-match.data(match_model_wor)# Balance diagnosticsbalance_wor <-bal.tab(match_model_wor, un =TRUE)# Print balance diagnosticsprint(balance_wor)```### Nearest neighbor matching without replacement diagnostics```{r}#| label: fig-balance-nearest-wor#| fig.cap: Balance diagnostics for nearest neighbor matching without replacementlove.plot(balance_wor, var.order ="unadjusted", threshold =0.1)```This @fig-balance-nearest-wor plot shows the standardized mean differences for each covariate before and after matching. The standardized mean differences are almost zero other than `race` and `re74` variables which are little away (blue dots), indicating that the treatment and control groups are well balanced on the covariates.## Matching with caliper The `caliper` argument specifies the maximum distance within which a control unit can be matched to a treatment unit. This is useful for ensuring that the matched units are similar to each other. ```{r caliper-matching}match_caliper <- matchit(treat ~ age + age2 + educ + race + married + re74 + re75, data = lalonde, method = "nearest", caliper = 0.01)bal.tab(match_caliper)```## Kernel matching```{r kernel-matching}#| error: truelibrary(Matching)match_kernel <- matchit(treat ~ age + age2 + educ + race+ married + re74 + re75, data = lalonde, method = "optimal")bal.tab(match_kernel)```## Radius matching```{r radius-matching}match_radius <- matchit(treat ~ age + age2 + educ +race + married + re74 + re75, data = lalonde, method = "nearest", caliper = 0.01)bal.tab(match_radius)```## Mahalanobis matching```{r mahalanobis-matching}match_mahal <- matchit(treat ~ age + age2 + educ + race+ married + re74 + re75, data = lalonde, method = "nearest", distance = "mahalanobis")bal.tab(match_mahal)``````{r}# Outcome analysis on the matched dataoutcome_model <-lm(re78 ~ treat, data = matched_data)summary(outcome_model)```## Propensity Score Matching with Logistic RegressionFit a logistic regression model to estimate propensity scores by including predictors: age, age squared, education, race (black and hispanic), marital status, and prior earnings (re74, re75) ```{r propensity-score}ps_model <- glm(treat ~ age + age2 + educ+race + married + re74 + re75, data = lalonde, family = binomial)# Calculate the propensity scores using the fitted modellalonde$pscore <- predict(ps_model, type = "response")# Perform propensity score matching with nearest neighbor matching without replacementmatch_model_logit <- matchit(treat ~ pscore, data = lalonde, method = "nearest", replace = FALSE)# Print the summary of the matchingsummary(match_model_logit)``````{r}matched_data_logit <-match.data(match_model_logit)```### Perform balance diagnostics to check if the covariates are balanced after matching ```{r}balance_logit <-bal.tab(match_model_logit, un =TRUE)# Print the balance diagnosticsprint(balance_logit)```This indicates that a difference of 1.15 is adjusted for the covariates. ### Conduct outcome analysis on the matched data using a linear model```{r}outcome_model <-lm(re78 ~ treat, data = matched_data_logit)summary(outcome_model)```## Entropic balancingEntropy balancing is a method used in causal inference and observational studies to achieve balance between treated and control groups by reweighting the data. It ensures that the covariate distributions in the treated and control groups are as similar as possible, thus reducing bias when estimating causal effects.> What is Entropic Balancing? Entropic balancing is based on the principle of entropy maximization. The idea is to find weights for each observation in the dataset such that the weighted empirical distributions of the covariates in the treated and control groups are as similar as possible. This is done by solving an optimization problem where the objective is to maximize entropy, subject to constraints that ensure balance on the covariates.**The process involves:** - Defining a set of covariates that need to be balanced. - Specifying moment conditions that the weighted covariates should satisfy (e.g., mean, variance). - Solving a convex optimization problem to find the weights that maximize the entropy while satisfying the moment conditions.```{r entropic-balancing}#Balancing covariates between treatment groups (binary)(ebal <- weightit( treat ~ age + age2+ educ + race+ married + re74+ re75, data = lalonde, method = "ebal", estimand = "ATT"))``````{r}summary(ebal)``````{r}cobalt::bal.tab(ebal)``````{r}# Add weights to the datasetlalonde$weights <- ebal$weights# Create a survey design objectdesign <-svydesign(ids =~1, weights =~weights, data = lalonde)# Weighted regression analysisoutcome_model_ebal <-svyglm(re78 ~ treat, design = design)# Summary of the modelsummary(outcome_model_ebal)```From these results one can see that entropy balancing has been successful in balancing the covariates between the treatment and control groups. The standardized mean differences are exactly zero for all covariates, indicating that the treatment and control groups are well balanced on the covariates.One can also see it graphically through distribution plots of the covariates before and after weighting. ## Covariate plots from Entropic balancingWe can also visualize the distribution of covariates before and after matching using the `bal.plot` function from the `cobalt` package. This function creates a plot that shows the distribution of covariates before and after matching for the treatment and control groups.```{r}# Function to create covariate distribution plots with custom colorscreate_bal_plot <-function(ebal_obj, var_name) {bal.plot(ebal_obj, var.name = var_name, which ="both",sample.names =c("Unmatched", "Matched")) +scale_fill_manual(values =c("#17becf", "#8c564b"), labels =c("Control", "Treatment")) +labs(title =paste(var_name, "Distribution"), x = var_name) +theme_minimal() +theme(axis.text.x =element_text(angle =45, hjust =1, size =12),plot.title =element_text(size =14))}``````{r}# Creating the plots with the specific functionplot.age <-create_bal_plot(ebal, "age")plot.educ <-create_bal_plot(ebal, "educ")plot.race <-create_bal_plot(ebal, "race")plot.married <-create_bal_plot(ebal, "married")library(patchwork)distribution_graphs <- (plot.age + plot.educ)/(plot.race + plot.married)``````{r}#| label: fig-dist#| fig-cap: Covariates distribution before and after weightingdistribution_graphs +ggtitle("Distribution Balance in Unmatched and Matched Samples") +theme(plot.title =element_text(size =16, hjust =0.5))```@fig-dist indicates the distribution of covariates before and after weighting. The plots show that the covariates are well balanced between the treatment and control groups after weighting. Entropic balancing has successfully reduced the differences in covariate distributions between the two groups without having reduction in sample size.## ConclusionIn this tutorial, we explored various methods for propensity score matching (PSM) to estimate the causal effect of a treatment on an outcome variable using observational data. We began with an introduction to the importance of causal inference in research and the challenges of accounting for confounding variables. Propensity score matching was highlighted as a key method for addressing these challenges, closely approximating Randomized Control Trials (RCTs).We demonstrated how to load and inspect the `lalonde` dataset, which includes information on individuals who participated in the National Supported Work Demonstration (NSW) program. Using this dataset, we performed a t-test to compare earnings between the treatment and control groups, revealing the need for more sophisticated matching techniques to obtain accurate treatment effect estimates.We covered several matching methods:- **Exact Matching**: Matching treatment and control units based on exact covariate values, which provided perfect balance but resulted in a small matched sample size.- **Propensity Score Matching**: Estimating the propensity score and matching units based on these scores, significantly improving balance while retaining more observations.- **Nearest Neighbor Matching**: Performing nearest neighbor matching with and without replacement to achieve balance, showing how different configurations affect the results.- **Caliper, Kernel, Radius, and Mahalanobis Matching**: Exploring these methods to further refine the matching process and balance covariates between the treatment and control groups.Additionally, we introduced **Entropic Balancing**, a powerful technique that reweights the data to balance covariate distributions between treated and control groups by maximizing entropy. This method achieved excellent balance without losing sample size.Through balance diagnostics and outcome analysis, we demonstrated the effectiveness of these methods in achieving covariate balance and estimating the treatment effect. The diagnostics plots and statistical summaries provided clear evidence of improved balance and more reliable estimates of the treatment effect.In conclusion, propensity score matching and entropic balancing are essential tools for causal inference in observational studies. By carefully selecting and applying these methods, researchers can obtain more accurate and credible estimates of treatment effects, even in the absence of random assignment. This tutorial provided a comprehensive overview of these techniques, illustrating their implementation in R and their impact on the analysis of observational data.