Wake Forest University
2021-09-01
Matching
Weighting
Stratification
Direct Adjustment
…
Greifer, N., & Stuart, E. A. (2021). Choosing the estimand when matching or weighting in observational studies. arXiv preprint arXiv:2106.10577. See also Choosing Estimands in our book.
Average Treatment Effect (ATE)
\[\tau = E[Y(1) - Y(0)]\]
Average Treatment Effect Among the Treated (ATT)
\[\tau = E[Y(1) - Y(0) | Z = 1]\]
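The estimands differ only in the population the individual effects are averaged over. A minimal Python sketch, assuming a hypothetical table in which both potential outcomes are known for every unit (never the case in real data; the values are made up for illustration). The third quantity is the control-group analogue, which conditions on Z = 0.

```python
# Hypothetical potential outcomes for six units (illustrative values only).
# y1 = outcome if exposed, y0 = outcome if unexposed, z = observed exposure.
units = [
    {"z": 1, "y1": 5.0, "y0": 2.0},
    {"z": 1, "y1": 4.0, "y0": 3.0},
    {"z": 0, "y1": 3.0, "y0": 3.0},
    {"z": 0, "y1": 6.0, "y0": 2.0},
    {"z": 0, "y1": 2.0, "y0": 1.0},
    {"z": 1, "y1": 7.0, "y0": 5.0},
]


def mean_effect(rows):
    """Average of Y(1) - Y(0) over the given rows."""
    return sum(r["y1"] - r["y0"] for r in rows) / len(rows)


ate = mean_effect(units)                               # everyone
att = mean_effect([r for r in units if r["z"] == 1])   # exposed units only
atc = mean_effect([r for r in units if r["z"] == 0])   # unexposed units only

print(f"ATE={ate:.3f}  ATT={att:.3f}  ATC={atc:.3f}")
```

Because the individual effects vary across units, the three averages need not agree, which is why the target estimand has to be chosen before matching or weighting.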
A matchit object
- method: 1:1 nearest neighbor matching without replacement
- distance: Propensity score
- estimated with logistic regression
- number of obs.: 1566 (original), 806 (matched)
- target estimand: ATT
- covariates: sex, race, age, I(age^2), education, smokeintensity, I(smokeintensity^2), smokeyrs, I(smokeyrs^2), exercise, active, wt71, I(wt71^2)
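The idea behind 1:1 nearest-neighbor matching without replacement can be sketched in a few lines of Python. This is a simplified stand-in, not the MatchIt implementation that produced the output above: the function name, the toy scores, and the choice to process exposed units from highest to lowest propensity score are all assumptions made for illustration.

```python
def nearest_neighbor_match(scores, treated):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    scores  : propensity scores, one per unit
    treated : 0/1 exposure indicators, in the same order
    Returns a list of (treated_index, control_index) pairs.
    """
    treated_idx = [i for i, z in enumerate(treated) if z == 1]
    available = {i for i, z in enumerate(treated) if z == 0}

    # Assumption: handle exposed units from highest to lowest score.
    treated_idx.sort(key=lambda i: scores[i], reverse=True)

    pairs = []
    for t in treated_idx:
        if not available:
            break  # no controls left to match
        # Closest remaining control; ties broken by index.
        c = min(available, key=lambda j: (abs(scores[j] - scores[t]), j))
        pairs.append((t, c))
        available.remove(c)  # "without replacement": use each control once
    return pairs


# Toy data: two exposed units and four candidate controls.
scores = [0.80, 0.30, 0.75, 0.35, 0.20, 0.60]
treated = [1, 1, 0, 0, 0, 0]

pairs = nearest_neighbor_match(scores, treated)
print(pairs)
```

Each pair contributes two rows to the matched data set, so the 806 matched observations reported above correspond to 403 pairs.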
Rows: 806
Columns: 71
$ i <chr> "11", "1220", "15", "1082", "18"…
$ subclass <fct> 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6,…
$ weights <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
$ seqn <dbl> 428, 23045, 446, 22294, 596, 140…
$ qsmk <dbl> 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,…
$ death <dbl> 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1,…
$ yrdth <dbl> NA, NA, 88, NA, NA, NA, NA, NA, …
$ modth <dbl> NA, NA, 1, NA, NA, NA, NA, NA, N…
$ dadth <dbl> NA, NA, 3, NA, NA, NA, NA, NA, N…
$ sbp <dbl> 135, 159, 141, 113, 151, NA, 125…
$ dbp <dbl> 89, 91, 79, 73, 80, NA, 71, 85, …
$ sex <fct> 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0,…
$ age <dbl> 43, 49, 71, 36, 48, 51, 56, 40, …
$ race <fct> 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,…
$ income <dbl> 19, 22, 17, 21, 18, 22, 20, 18, …
$ marital <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,…
$ school <dbl> 12, 12, 0, 12, 12, 9, 12, 10, 17…
$ education <fct> 3, 3, 1, 3, 3, 2, 3, 2, 5, 5, 3,…
$ ht <dbl> 176.5938, 160.2812, 147.0938, 17…
$ wt71 <dbl> 63.96, 47.29, 75.64, 68.38, 62.0…
$ wt82 <dbl> 79.83226, 53.07031, 56.69905, 73…
$ wt82_71 <dbl> 15.8722571, 5.7803073, -18.94095…
$ birthplace <dbl> 42, 30, NA, 19, 36, 47, NA, 42, …
$ smokeintensity <dbl> 30, 20, 40, 9, 2, 5, 20, 5, 30, …
$ smkintensity82_71 <dbl> -30, 0, -40, 1, -2, 1, -20, 0, -…
$ smokeyrs <dbl> 24, 29, 41, 30, 30, 29, 11, 20, …
$ asthma <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ bronch <dbl> 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0,…
$ tb <dbl> 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,…
$ hf <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ hbp <dbl> 0, 2, 0, 2, 0, 0, 0, 0, 0, 2, 1,…
$ pepticulcer <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ colitis <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ hepatitis <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ chroniccough <dbl> 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0,…
$ hayfever <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ diabetes <dbl> 0, 2, 0, 2, 0, 0, 0, 0, 0, 2, 0,…
$ polio <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ tumor <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ nervousbreak <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ alcoholpy <dbl> 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1,…
$ alcoholfreq <dbl> 3, 0, 4, 0, 1, 2, 3, 1, 3, 0, 2,…
$ alcoholtype <dbl> 3, 3, 4, 1, 2, 3, 4, 1, 2, 3, 3,…
$ alcoholhowmuch <dbl> 2, 2, NA, 6, 1, 3, NA, 5, 1, 3, …
$ pica <dbl> 0, 2, 0, 2, 0, 0, 0, 0, 0, 2, 0,…
$ headache <dbl> 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1,…
$ otherpain <dbl> 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,…
$ weakheart <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ allergies <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ nerves <dbl> 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ lackpep <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ hbpmed <dbl> 0, 2, 0, 2, 0, 0, 0, 0, 0, 2, 1,…
$ boweltrouble <dbl> 0, 2, 1, 2, 0, 0, 0, 0, 0, 2, 0,…
$ wtloss <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ infection <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ active <fct> 1, 2, 1, 0, 1, 0, 0, 0, 0, 1, 2,…
$ exercise <fct> 1, 1, 1, 0, 1, 0, 2, 0, 0, 2, 2,…
$ birthcontrol <dbl> 2, 0, 0, 2, 0, 2, 0, 0, 2, 0, 2,…
$ pregnancies <dbl> NA, 4, 15, NA, 3, NA, 4, 2, NA, …
$ cholesterol <dbl> 173, 279, 229, 200, 225, 199, 23…
$ hightax82 <dbl> 0, 0, NA, 0, 0, 0, NA, 0, 0, 0, …
$ price71 <dbl> 2.346680, 2.104980, NA, 2.199707…
$ price82 <dbl> 1.797363, 1.698242, NA, 1.847900…
$ tax71 <dbl> 1.3649902, 1.0498047, NA, 1.1022…
$ tax82 <dbl> 0.5718994, 0.4399414, NA, 0.5718…
$ price71_82 <dbl> 0.54931641, 0.40686035, NA, 0.35…
$ tax71_82 <dbl> 0.7929688, 0.6099854, NA, 0.5303…
$ id <int> 11, 1274, 15, 1135, 18, 564, 23,…
$ censored <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ older <dbl> 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1,…
$ distance <dbl> 0.2384597, 0.2381918, 0.2090935,…
Average Treatment Effect Among the Controls (ATC)
\[\tau = E[Y(1) - Y(0) | Z = 0]\]
A matchit object
- method: 1:1 nearest neighbor matching without replacement
- distance: Propensity score
- estimated with logistic regression
- number of obs.: 1566 (original), 806 (matched)
- target estimand: ATC
- covariates: sex, race, age, I(age^2), education, smokeintensity, I(smokeintensity^2), smokeyrs, I(smokeyrs^2), exercise, active, wt71, I(wt71^2)
A match is only allowed when the two propensity scores (on the linear logit scale) are within 0.1 standard deviations of each other (the caliper). Observations with no available match inside the caliper are discarded.
A matchit object
- method: 1:1 nearest neighbor matching without replacement
- distance: Propensity score [caliper]
- estimated with logistic regression and linearized
- caliper: <distance> (0.063)
- number of obs.: 1566 (original), 780 (matched)
- target estimand: ATT
- covariates: sex, race, age, I(age^2), education, smokeintensity, I(smokeintensity^2), smokeyrs, I(smokeyrs^2), exercise, active, wt71, I(wt71^2)
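A caliper adds one rule to greedy matching: a pair is formed only if the two logit-scale scores differ by no more than the caliper width. The Python sketch below is an illustration under stated assumptions, not MatchIt's code: the function names and toy scores are hypothetical, and the standard deviation is taken over the logit scores of the full sample.

```python
import math
import statistics


def logit(p):
    """Map a probability to the linear (log-odds) scale."""
    return math.log(p / (1 - p))


def caliper_match(scores, treated, caliper_sd=0.1):
    """Greedy 1:1 matching without replacement, restricted by a caliper.

    Returns (pairs, unmatched_treated, caliper_width).
    """
    lin = [logit(p) for p in scores]
    # Assumption: caliper width = caliper_sd * SD of all logit scores.
    width = caliper_sd * statistics.stdev(lin)

    treated_idx = sorted(
        (i for i, z in enumerate(treated) if z == 1),
        key=lambda i: lin[i],
        reverse=True,
    )
    available = {i for i, z in enumerate(treated) if z == 0}

    pairs, unmatched = [], []
    for t in treated_idx:
        in_range = [j for j in available if abs(lin[j] - lin[t]) <= width]
        if not in_range:
            unmatched.append(t)  # no control inside the caliper: discard
            continue
        c = min(in_range, key=lambda j: (abs(lin[j] - lin[t]), j))
        pairs.append((t, c))
        available.remove(c)
    return pairs, unmatched, width


# Toy data: unit 0 has no control anywhere near its score.
scores = [0.90, 0.30, 0.31, 0.50, 0.20, 0.10]
treated = [1, 1, 0, 0, 0, 0]

pairs, unmatched, width = caliper_match(scores, treated)
print(pairs, unmatched, round(width, 3))
```

Discarding exposed units is also why the matched sample in the output above shrinks from 806 to 780 observations, and it means the matched group no longer represents every exposed unit.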
Rows: 780
Columns: 71
$ i <chr> "11", "1220", "15", "1082", "18"…
$ subclass <fct> 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6,…
$ weights <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
$ seqn <dbl> 428, 23045, 446, 22294, 596, 140…
$ qsmk <dbl> 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,…
$ death <dbl> 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1,…
$ yrdth <dbl> NA, NA, 88, NA, NA, NA, NA, NA, …
$ modth <dbl> NA, NA, 1, NA, NA, NA, NA, NA, N…
$ dadth <dbl> NA, NA, 3, NA, NA, NA, NA, NA, N…
$ sbp <dbl> 135, 159, 141, 113, 151, NA, 125…
$ dbp <dbl> 89, 91, 79, 73, 80, NA, 71, 85, …
$ sex <fct> 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1,…
$ age <dbl> 43, 49, 71, 36, 48, 51, 56, 40, …
$ race <fct> 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,…
$ income <dbl> 19, 22, 17, 21, 18, 22, 20, 18, …
$ marital <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3,…
$ school <dbl> 12, 12, 0, 12, 12, 9, 12, 10, 17…
$ education <fct> 3, 3, 1, 3, 3, 2, 3, 2, 5, 5, 2,…
$ ht <dbl> 176.5938, 160.2812, 147.0938, 17…
$ wt71 <dbl> 63.96, 47.29, 75.64, 68.38, 62.0…
$ wt82 <dbl> 79.83226, 53.07031, 56.69905, 73…
$ wt82_71 <dbl> 15.8722571, 5.7803073, -18.94095…
$ birthplace <dbl> 42, 30, NA, 19, 36, 47, NA, 42, …
$ smokeintensity <dbl> 30, 20, 40, 9, 2, 5, 20, 5, 30, …
$ smkintensity82_71 <dbl> -30, 0, -40, 1, -2, 1, -20, 0, -…
$ smokeyrs <dbl> 24, 29, 41, 30, 30, 29, 11, 20, …
$ asthma <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ bronch <dbl> 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0,…
$ tb <dbl> 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,…
$ hf <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ hbp <dbl> 0, 2, 0, 2, 0, 0, 0, 0, 0, 2, 0,…
$ pepticulcer <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ colitis <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ hepatitis <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ chroniccough <dbl> 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0,…
$ hayfever <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ diabetes <dbl> 0, 2, 0, 2, 0, 0, 0, 0, 0, 2, 0,…
$ polio <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ tumor <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ nervousbreak <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ alcoholpy <dbl> 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1,…
$ alcoholfreq <dbl> 3, 0, 4, 0, 1, 2, 3, 1, 3, 0, 1,…
$ alcoholtype <dbl> 3, 3, 4, 1, 2, 3, 4, 1, 2, 3, 3,…
$ alcoholhowmuch <dbl> 2, 2, NA, 6, 1, 3, NA, 5, 1, 3, …
$ pica <dbl> 0, 2, 0, 2, 0, 0, 0, 0, 0, 2, 0,…
$ headache <dbl> 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0,…
$ otherpain <dbl> 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,…
$ weakheart <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ allergies <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ nerves <dbl> 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ lackpep <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ hbpmed <dbl> 0, 2, 0, 2, 0, 0, 0, 0, 0, 2, 0,…
$ boweltrouble <dbl> 0, 2, 1, 2, 0, 0, 0, 0, 0, 2, 0,…
$ wtloss <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ infection <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ active <fct> 1, 2, 1, 0, 1, 0, 0, 0, 0, 1, 1,…
$ exercise <fct> 1, 1, 1, 0, 1, 0, 2, 0, 0, 2, 2,…
$ birthcontrol <dbl> 2, 0, 0, 2, 0, 2, 0, 0, 2, 0, 0,…
$ pregnancies <dbl> NA, 4, 15, NA, 3, NA, 4, 2, NA, …
$ cholesterol <dbl> 173, 279, 229, 200, 225, 199, 23…
$ hightax82 <dbl> 0, 0, NA, 0, 0, 0, NA, 0, 0, 0, …
$ price71 <dbl> 2.346680, 2.104980, NA, 2.199707…
$ price82 <dbl> 1.797363, 1.698242, NA, 1.847900…
$ tax71 <dbl> 1.3649902, 1.0498047, NA, 1.1022…
$ tax82 <dbl> 0.5718994, 0.4399414, NA, 0.5718…
$ price71_82 <dbl> 0.54931641, 0.40686035, NA, 0.35…
$ tax71_82 <dbl> 0.7929688, 0.6099854, NA, 0.5303…
$ id <int> 11, 1274, 15, 1135, 18, 564, 23,…
$ censored <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ older <dbl> 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1,…
$ distance <dbl> -1.1611429, -1.1626186, -1.33039…
Average Treatment Effect (ATE)
\[\Large w_{ATE} = \frac{Z_i}{p_i} + \frac{1-Z_i}{1 - p_i}\]
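As a worked example of the ATE weight, the Python sketch below applies the formula to four made-up units and then uses the weights in a weighted difference in means. The data and the choice of a normalized (ratio) weighted mean as the estimator are assumptions for illustration; the slides do not specify an estimator.

```python
def ate_weight(z, p):
    """ATE weight: Z/p + (1 - Z)/(1 - p)."""
    return z / p + (1 - z) / (1 - p)


def weighted_mean(values, weights):
    """Normalized weighted mean."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical exposure, propensity score, and outcome for four units.
z = [1, 1, 0, 0]
p = [0.8, 0.5, 0.5, 0.2]
y = [10.0, 6.0, 4.0, 2.0]

w = [ate_weight(zi, pi) for zi, pi in zip(z, p)]

# Weighted mean outcome in each exposure group, then the difference.
exposed = [(yi, wi) for yi, zi, wi in zip(y, z, w) if zi == 1]
unexposed = [(yi, wi) for yi, zi, wi in zip(y, z, w) if zi == 0]

mean_exposed = weighted_mean(*zip(*exposed))
mean_unexposed = weighted_mean(*zip(*unexposed))
estimate = mean_exposed - mean_unexposed

print(w, round(estimate, 3))
```

Units that received an exposure they were unlikely to receive get the largest weights, so each weighted group stands in for the full population.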
Average Treatment Effect Among the Controls (ATC)
\[\Large w_{ATC} = \frac{(1-p_i)Z_i}{p_i} + \frac{(1-p_i)(1-Z_i)}{(1-p_i)}\]
Average Treatment Effect Among the Overlap Population (ATO)
\[\Large w_{ATO} = (1-p_i)Z_i + p_i(1-Z_i)\]
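The ATC and ATO weights can be checked numerically with a short Python sketch. The function names and the example propensity score of 0.25 are assumptions chosen for illustration.

```python
def atc_weight(z, p):
    """ATC weight: (1 - p) Z / p + (1 - p)(1 - Z) / (1 - p)."""
    return (1 - p) * z / p + (1 - p) * (1 - z) / (1 - p)


def ato_weight(z, p):
    """Overlap (ATO) weight: (1 - p) Z + p (1 - Z)."""
    return (1 - p) * z + p * (1 - z)


p = 0.25  # hypothetical propensity score

atc_exposed = atc_weight(1, p)    # (1 - p) / p, the odds of being unexposed
atc_control = atc_weight(0, p)    # the second term simplifies to 1
ato_exposed = ato_weight(1, p)    # probability of the opposite exposure
ato_control = ato_weight(0, p)

print(atc_exposed, atc_control, ato_exposed, ato_control)
```

Two properties follow directly from the formulas: under ATC weighting every control keeps a weight of 1 while exposed units are reweighted to resemble the controls, and ATO weights always lie between 0 and 1, so no single unit can dominate the estimate.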