# Propensity Score Diagnostics

Lucy D’Agostino McGowan

Wake Forest University

2021-09-01

## Checking balance

• Love plots (Standardized Mean Difference)
• ECDF plots

## Standardized Mean Difference (SMD)

$\LARGE d = \frac{\bar{x}_{treatment}-\bar{x}_{control}}{\sqrt{\frac{s^2_{treatment}+s^2_{control}}{2}}}$
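
As a quick check of the formula, the sketch below computes an unweighted SMD by hand in base R. The toy data and the helper name `smd_by_hand()` are made up for illustration; they are not part of any package used in these slides.

```r
# Toy covariate values for each exposure group (illustrative only)
treatment <- c(5, 7, 9, 11)
control <- c(4, 6, 8, 10)

# Hypothetical helper: difference in means divided by the pooled SD,
# where the pooled SD averages the two group variances
smd_by_hand <- function(x_treatment, x_control) {
  pooled_sd <- sqrt((var(x_treatment) + var(x_control)) / 2)
  (mean(x_treatment) - mean(x_control)) / pooled_sd
}

d <- smd_by_hand(treatment, control)
d # about 0.387
```

The means differ by 1 and both groups have variance 20/3, so the SMD is 1 / sqrt(20/3).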

## SMD in R

Calculate standardized mean differences

library(halfmoon)
library(tidyverse)

smds <- tidy_smd(
  df,
  .vars = c(confounder_1, confounder_2, ...),
  .group = exposure,
  .wts = wts # weight is optional
)
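
To see what supplying weights changes, here is a minimal base R sketch of a weighted SMD. It assumes a plain weighted mean and weighted variance with no small-sample correction, which is one common choice and not necessarily the exact estimator `tidy_smd()` uses. The helper `smd_weighted()` and the toy data are hypothetical.

```r
# Hypothetical helper: SMD with optional weights (defaults to unweighted)
smd_weighted <- function(x, group, w = rep(1, length(x))) {
  w_mean <- function(x, w) sum(w * x) / sum(w)
  # Weighted variance around the weighted mean (assumed form, no correction)
  w_var <- function(x, w) sum(w * (x - w_mean(x, w))^2) / sum(w)

  treated <- group == 1
  mean_diff <- w_mean(x[treated], w[treated]) - w_mean(x[!treated], w[!treated])
  pooled_sd <- sqrt((w_var(x[treated], w[treated]) + w_var(x[!treated], w[!treated])) / 2)
  mean_diff / pooled_sd
}

# Toy data: the exposed group has more high values of the confounder
x <- c(10, 20, 20, 10, 10, 20)
group <- c(1, 1, 1, 0, 0, 0)
# Toy weights chosen so the weighted means match across groups
w <- c(2, 1, 1, 1, 1, 2)

smd_before <- smd_weighted(x, group)    # imbalance without weights
smd_after <- smd_weighted(x, group, w)  # balanced after weighting
c(before = smd_before, after = smd_after)
```

In this toy case the weights bring both weighted means to 15, so the weighted SMD is zero while the unweighted one is not.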

## SMD in R

Plot them! (in a Love plot!)

ggplot(
  data = smds,
  aes(
    x = abs(smd),
    y = variable,
    group = weights,
    color = weights
  )
) +
  geom_love()

## Love plot


## ECDF

For continuous variables, it can be helpful to compare the whole distribution pre- and post-weighting rather than a single summary measure such as the SMD.
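
As a rough illustration of what a weighted ECDF computes, the base R sketch below returns, for a value `q`, the share of total weight held by observations at or below `q`. The helper `weighted_ecdf()` and the toy numbers are assumptions made for this example; the variable names simply mirror those used in the plots.

```r
# Hypothetical helper: returns a function giving the weighted proportion <= q
weighted_ecdf <- function(x, w = rep(1, length(x))) {
  function(q) {
    vapply(q, function(q_i) sum(w[x <= q_i]) / sum(w), numeric(1))
  }
}

# Toy data (illustrative only)
wt71 <- c(60, 70, 80, 90)
w_ate <- c(1, 1, 2, 4)

ecdf_unweighted <- weighted_ecdf(wt71)
ecdf_weighted <- weighted_ecdf(wt71, w_ate)

ecdf_unweighted(75) # 0.5: two of four observations are <= 75
ecdf_weighted(75)   # 0.25: they hold 2 of the 8 units of weight
```

With all weights equal to 1, this matches base R's `ecdf()`.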

## Unweighted ECDF

ggplot(df, aes(x = wt71, color = factor(qsmk))) +
  geom_ecdf() +
  scale_color_manual(
    "Quit smoking",
    values = c("#5154B8", "#5DB854"),
    # labels follow the factor levels: qsmk is coded 0 = no, 1 = yes
    labels = c("No", "Yes")
  ) +
  xlab("Weight in Kg in 1971") +
  ylab("Proportion <= x")

## Weighted ECDF

ggplot(df, aes(x = wt71, color = factor(qsmk))) +
  geom_ecdf(aes(weights = w_ate)) +
  scale_color_manual(
    "Quit smoking",
    values = c("#5154B8", "#5DB854"),
    # labels follow the factor levels: qsmk is coded 0 = no, 1 = yes
    labels = c("No", "Yes")
  ) +
  xlab("Weight in Kg in 1971") +
  ylab("Proportion <= x (Weighted)")


## 1. Create a “design object” to incorporate the weights

library(survey)

svy_des <- svydesign(
  ids = ~ 1, # no clustering
  data = df,
  weights = ~ wts
)

## 2. Pass to gtsummary::tbl_svysummary()

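library(gtsummary)

tbl_svysummary(svy_des, by = x) |>
  # add a standardized mean difference for every variable
  add_difference(test = everything() ~ "smd")
  # pipe into modify_column_hide(ci) to hide the CI column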

| Characteristic | 0, N = 1,565¹ | 1, N = 1,561¹ | Difference² |
|---|---|---|---|
| WEIGHT IN KILOGRAMS IN 1971 | 69 (60, 80) | 69 (59, 79) | 0.01 |
| 0: WHITE 1: BLACK OR OTHER IN 1971 | | | 0.01 |
| 0 | 1,359 (87%) | 1,352 (87%) | |
| 1 | 206 (13%) | 209 (13%) | |
| AGE IN 1971 | 43 (33, 52) | 43 (33, 53) | -0.01 |
| 0: MALE 1: FEMALE | | | 0.00 |
| 0 | 764 (49%) | 764 (49%) | |
| 1 | 802 (51%) | 797 (51%) | |
| NUMBER OF CIGARETTES SMOKED PER DAY IN 1971 | 20 (10, 25) | 20 (10, 30) | 0.02 |
| YEARS OF SMOKING | 24 (15, 33) | 24 (14, 33) | 0.00 |
| IN RECREATION, HOW MUCH EXERCISE? IN 1971, 0:much exercise,1:moderate exercise,2:little or no exercise | | | 0.04 |
| 0 | 302 (19%) | 294 (19%) | |
| 1 | 665 (42%) | 691 (44%) | |
| 2 | 599 (38%) | 576 (37%) | |
| IN YOUR USUAL DAY, HOW ACTIVE ARE YOU? IN 1971, 0:very active, 1:moderately active, 2:inactive | | | 0.03 |
| 0 | 700 (45%) | 684 (44%) | |
| 1 | 718 (46%) | 738 (47%) | |
| 2 | 147 (9.4%) | 138 (8.9%) | |

¹ Median (IQR); n (%)

² Standardized Mean Difference