RWE Data Analysis Using Propensity Score Matching (PSM): Concept, Implementation, and Interpretation

Data Science

In this post, I will go through a basic introduction to the Propensity Score Matching (PSM) method for estimating the causal effects of a treatment on an outcome. PSM is one of the most commonly implemented methods in observational RWE studies.

Mihiretu Kebede (PhD)
2025-11-20

Introduction

In randomized controlled trials, treatment assignment is random and independent of potential confounders, so the ignorability (exchangeability) assumption is satisfied and estimating average causal effects is straightforward. In observational studies, however, treatment assignment is not random: individuals are assigned to a treatment based on their condition or background characteristics. These characteristics influence both treatment and outcome and create confounding, so the estimated effect sizes will be biased and may be far from the causal effects.

Let \(T\) denote the treatment indicator and let \(Y(1)\) and \(Y(0)\) denote the potential outcomes under treatment and under control.

The individual-level causal effect is

\[ \tau = Y(1) - Y(0) \]

but for any given person only one potential outcome can ever be observed, either the result of receiving the treatment or the result of not receiving it, never both. This is the fundamental problem of causal inference.

If we compare outcomes without adjustment:

\[ E[Y \mid T=1] - E[Y \mid T=0] \]

the difference is generally biased because the groups differ in covariates.

Confounding as a backdoor path

A confounder affects both treatment and outcome. This creates a backdoor path:

\[ T \leftarrow X \to Y \]

To identify causal effects, we assume conditional exchangeability:

\[ Y(1), Y(0) \perp T \mid X \]

meaning that, given the set of potential confounders \(X\), being assigned to either treatment arm is as good as random. Therefore, after properly adjusting for the relevant covariates (potential confounders), treatment assignment can be treated as close to or as good as random. This assumption is called conditional exchangeability, conditional ignorability, or causal sufficiency.
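A small simulation (an illustration, not part of the lalonde analysis below) makes this concrete: a confounder X increases both the chance of treatment and the outcome, so the naive difference in means is biased, while adjusting for X recovers the true effect.

# Illustrative simulation: X confounds T and Y
set.seed(123)
n <- 10000
x <- rnorm(n)                          # confounder
t <- rbinom(n, 1, plogis(1.5 * x))     # treatment depends on x
y <- 2 * t + 3 * x + rnorm(n)          # true treatment effect = 2

mean(y[t == 1]) - mean(y[t == 0])      # naive contrast: biased upward
coef(lm(y ~ t + x))["t"]               # adjusted estimate: close to 2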

The propensity score matching method

The propensity score is the probability of receiving treatment given the covariates. More formally, the propensity score (Rosenbaum and Rubin 1983) is defined as:

\[ e(X) = P(T = 1 \mid X) \]

If exchangeability holds given \(X\), then it also holds given the scalar \(e(X)\):

\[ Y(1), Y(0) \perp T \mid e(X) \]

This means that conditioning on a single aggregated scalar (score) \(e(X)\) is sufficient to remove confounding due to all observed covariates \(X\).

Goal of Propensity Score Matching

Propensity Score Matching (PSM) matches treated individuals with control individuals who have similar propensity scores, creating a balanced sample. After matching, the target estimand is the Average Treatment Effect on the Treated (ATT):

\[ ATT = E[Y(1) - Y(0) \mid T = 1] \]
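With 1:1 matching, a natural estimator of the ATT is the average within-pair difference in outcomes, where \(N_1\) is the number of treated units and \(m(i)\) denotes the control matched to treated unit \(i\):

\[ \widehat{ATT} = \frac{1}{N_1} \sum_{i:\, T_i = 1} \left( Y_i - Y_{m(i)} \right) \]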

For this tutorial, I will use the lalonde dataset (the version shipped with the MatchIt package). The data set has 614 observations, with 185 treated and 429 control subjects, and 9 variables.

The last variable is re78, the real earnings in 1978; that is our outcome variable. You may take a look at the dataset and its source here.

Before we dive into the analysis, we will construct the DAG. We want to see whether the treatment (the training program, treat) has an effect on earnings (re78). To construct the DAG, I used the web-based DAGitty tool; you can also use the dagitty R package, but I found the web version easier for adjusting the plot. The resulting DAG is simple: each baseline covariate is connected to both the treatment and the outcome, a classic set of confounders. The aim of matching is to block all backdoor paths from the treatment exposure (treat) to the outcome (re78) by balancing on these covariates.
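For readers who prefer to stay in R, a minimal sketch of the same DAG with the dagitty package (showing only a few of the covariates for brevity; node names follow the dataset used below) could look like this:

library(dagitty)

# Sketch of the DAG: each covariate affects both treat and re78
dag <- dagitty("dag {
  treat [exposure]
  re78 [outcome]
  treat -> re78
  age -> treat
  age -> re78
  educ -> treat
  educ -> re78
  re74 -> treat
  re74 -> re78
}")

adjustmentSets(dag)     # covariate sets that block all backdoor paths
plot(graphLayout(dag))  # quick automatic layout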

Now let's proceed to the analysis.

library(tableone) # for easy creation of baseline tables with SMDs
library(Matching)
library(MatchIt)

data(lalonde) #load the dataset
df <- lalonde
table(df$race) 

 black hispan  white 
   243     72    299 
# Create race indicators
df$black <- as.numeric(df$race == "black")
df$hispan <- as.numeric(df$race == "hispan")

# Confounders: let's call them xvars
xvars <- c("age", "educ", "black", "hispan",
           "married", "nodegree", "re74", "re75")

# Table 1 before matching
table1_before <- CreateTableOne(vars = xvars, data = df, strata = "treat", test = FALSE)
print(table1_before, smd = TRUE)
                      Stratified by treat
                       0                 1                 SMD   
  n                        429               185                 
  age (mean (SD))        28.03 (10.79)     25.82 (7.16)     0.242
  educ (mean (SD))       10.24 (2.86)      10.35 (2.01)     0.045
  black (mean (SD))       0.20 (0.40)       0.84 (0.36)     1.668
  hispan (mean (SD))      0.14 (0.35)       0.06 (0.24)     0.277
  married (mean (SD))     0.51 (0.50)       0.19 (0.39)     0.719
  nodegree (mean (SD))    0.60 (0.49)       0.71 (0.46)     0.235
  re74 (mean (SD))     5619.24 (6788.75) 2095.57 (4886.62)  0.596
  re75 (mean (SD))     2466.48 (3292.00) 1532.06 (3219.25)  0.287

The standardized mean differences (SMDs) show imbalance in several baseline covariates. This is why we need to balance the groups; PSM is one way to do so.

Greedy Matching on Covariates

Greedy matching, also known as nearest-neighbour matching, matches each treated unit to the nearest control using Mahalanobis distance, Euclidean distance, or propensity scores. For more details, check the MatchIt package documentation.

# 1:1 greedy matching directly on the covariates
greedy_match <- Match(Tr = df$treat, M = 1, X = df[xvars])
matched1 <- df[unlist(greedy_match[c("index.treated", "index.control")]),]

table_greedy <- CreateTableOne(vars = xvars, data = matched1,
                               strata = "treat", test = FALSE)
print(table_greedy, smd = TRUE)
                      Stratified by treat
                       0                 1                 SMD   
  n                        207               207                 
  age (mean (SD))        24.60 (8.14)      25.04 (7.14)     0.058
  educ (mean (SD))       10.45 (1.91)      10.35 (1.95)     0.050
  black (mean (SD))       0.85 (0.36)       0.86 (0.35)     0.014
  hispan (mean (SD))      0.05 (0.22)       0.05 (0.22)    <0.001
  married (mean (SD))     0.17 (0.38)       0.17 (0.38)    <0.001
  nodegree (mean (SD))    0.71 (0.46)       0.71 (0.46)    <0.001
  re74 (mean (SD))     1930.36 (4062.55) 1877.00 (4662.18)  0.012
  re75 (mean (SD))     1000.14 (2333.95) 1377.81 (3076.01)  0.138

As you can see, greedy matching on the covariates has already improved balance considerably. However, matching directly on the covariates may not be optimal; propensity score matching usually performs better.
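The Mahalanobis distance mentioned above can also be requested directly in MatchIt; a rough equivalent of the Match() call above (a sketch, assuming a recent MatchIt version, not used for the output in this post) would be:

m.cov <- matchit(treat ~ age + educ + black + hispan + married +
                   nodegree + re74 + re75,
                 data = df, method = "nearest",
                 distance = "mahalanobis")
summary(m.cov)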

Estimate the Propensity Score

Estimating the propensity score is done by fitting a logistic regression of the treatment variable on the set of confounders.

psmodel1 <- glm(treat ~ age + educ + black + 
                  hispan + married + nodegree +
                  re74 + re75, family =
                  binomial(), data = df)

df$pscore <- psmodel1$fitted.values

Check overlap

library(ggplot2)

ggplot(df, aes(x = pscore, fill = factor(treat))) +
  geom_histogram(alpha = 0.4, position = "identity", bins = 30) +
  labs(title = "Propensity score distribution before matching",
       fill = "Treatment", x = "Propensity score")

Then let’s do the matching and check the plot after matching

m.out1 <- matchit(treat ~ age + educ + black + hispan + married +
                    nodegree + re74 + re75,
                  data = df, method = "nearest")

summary(m.out1)

Call:
matchit(formula = treat ~ age + educ + black + hispan + married + 
    nodegree + re74 + re75, data = df, method = "nearest")

Summary of Balance for All Data:
         Means Treated Means Control Std. Mean Diff. Var. Ratio
distance        0.5774        0.1822          1.7941     0.9211
age            25.8162       28.0303         -0.3094     0.4400
educ           10.3459       10.2354          0.0550     0.4959
black           0.8432        0.2028          1.7615          .
hispan          0.0595        0.1422         -0.3498          .
married         0.1892        0.5128         -0.8263          .
nodegree        0.7081        0.5967          0.2450          .
re74         2095.5737     5619.2365         -0.7211     0.5181
re75         1532.0553     2466.4844         -0.2903     0.9563
         eCDF Mean eCDF Max
distance    0.3774   0.6444
age         0.0813   0.1577
educ        0.0347   0.1114
black       0.6404   0.6404
hispan      0.0827   0.0827
married     0.3236   0.3236
nodegree    0.1114   0.1114
re74        0.2248   0.4470
re75        0.1342   0.2876

Summary of Balance for Matched Data:
         Means Treated Means Control Std. Mean Diff. Var. Ratio
distance        0.5774        0.3629          0.9739     0.7566
age            25.8162       25.3027          0.0718     0.4568
educ           10.3459       10.6054         -0.1290     0.5721
black           0.8432        0.4703          1.0259          .
hispan          0.0595        0.2162         -0.6629          .
married         0.1892        0.2108         -0.0552          .
nodegree        0.7081        0.6378          0.1546          .
re74         2095.5737     2342.1076         -0.0505     1.3289
re75         1532.0553     1614.7451         -0.0257     1.4956
         eCDF Mean eCDF Max Std. Pair Dist.
distance    0.1321   0.4216          0.9740
age         0.0847   0.2541          1.3938
educ        0.0239   0.0757          1.2474
black       0.3730   0.3730          1.0259
hispan      0.1568   0.1568          1.0743
married     0.0216   0.0216          0.8281
nodegree    0.0703   0.0703          1.0106
re74        0.0469   0.2757          0.7965
re75        0.0452   0.2054          0.7381

Sample Sizes:
          Control Treated
All           429     185
Matched       185     185
Unmatched     244       0
Discarded       0       0
matched_df <- match.data(m.out1) # create the matched dataset

Then check visually after matching

ggplot(matched_df, aes(x = pscore, fill = factor(treat))) +
  geom_histogram(alpha = 0.4, position = "identity", bins = 30) +
  labs(title = "Propensity score distribution after matching",
       fill = "Treatment", x = "Propensity score")

Let's now check the SMDs for the PSM-matched dataset.

table_psm_matched <- CreateTableOne(vars = xvars, data = matched_df,
                               strata = "treat", test = FALSE)
print(table_psm_matched, smd = TRUE)
                      Stratified by treat
                       0                 1                 SMD   
  n                        185               185                 
  age (mean (SD))        25.30 (10.59)     25.82 (7.16)     0.057
  educ (mean (SD))       10.61 (2.66)      10.35 (2.01)     0.110
  black (mean (SD))       0.47 (0.50)       0.84 (0.36)     0.852
  hispan (mean (SD))      0.22 (0.41)       0.06 (0.24)     0.466
  married (mean (SD))     0.21 (0.41)       0.19 (0.39)     0.054
  nodegree (mean (SD))    0.64 (0.48)       0.71 (0.46)     0.150
  re74 (mean (SD))     2342.11 (4238.98) 2095.57 (4886.62)  0.054
  re75 (mean (SD))     1614.75 (2632.35) 1532.06 (3219.25)  0.028

Check covariate balance using Love plot

library(cobalt)

love.plot(m.out1, stats = "mean.diffs",
          threshold = 0.1, abs = TRUE,
          var.order = "unadjusted")

After matching, the standardized mean differences for age, married, re74, and re75 drop below 0.1, but black and hispan remain clearly imbalanced and educ and nodegree sit just above the threshold. Balance has improved, though not for every covariate; we will move on to the causal effect calculation and then tighten the matching with a caliper.

Outcome analysis

We use a paired t-test to evaluate the outcome on our PSM-matched dataset; the mean difference is our effect size measure.

y_trt <- matched_df$re78[matched_df$treat == 1]
y_ctrl <- matched_df$re78[matched_df$treat == 0]

t.test(y_trt, y_ctrl, paired = TRUE)

    Paired t-test

data:  y_trt and y_ctrl
t = 1.1866, df = 184, p-value = 0.2369
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -592.7088 2381.4438
sample estimates:
mean of the differences 
               894.3675 

The difference in means is the Average Treatment Effect on the Treated (ATT). This estimate reflects the causal effect of the training program on those who participated, under the assumption that all confounders have been adequately balanced.
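The paired t-test treats the two groups as ordered pairs. A commonly used alternative, sketched below under the assumption that the lmtest and sandwich packages are installed, regresses the outcome on treatment in the matched sample and uses standard errors clustered on the matched pairs (the subclass variable created by match.data()):

library(lmtest)
library(sandwich)

# ATT as the coefficient on treat, with SEs clustered on matched pairs
fit <- lm(re78 ~ treat, data = matched_df, weights = weights)
coeftest(fit, vcov. = vcovCL, cluster = ~subclass)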

PSM using caliper

Caliper matching prevents poor matches and often improves covariate balance even further. It is achieved by setting the caliper argument of the Match() function (specified in standard deviation units): treated-control pairs whose scores differ by more than the caliper are discarded, leaving fewer matched observations.
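For reference, the same idea can be expressed in MatchIt (a sketch, not used for the output below; in matchit() the caliper is given in standard deviations of the propensity score by default):

m.out2 <- matchit(treat ~ age + educ + black + hispan + married +
                    nodegree + re74 + re75,
                  data = df, method = "nearest", caliper = 0.1)
summary(m.out2)

The output that follows, however, comes from the Match() function with the caliper applied to the log propensity score.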

pscore1 <- df$pscore

# 1:1 matching on the log propensity score, without replacement,
# dropping pairs that differ by more than the 0.1 caliper
psmatch2 <- Match(Tr = df$treat, M = 1, X = log(pscore1),
                  replace = FALSE, caliper = 0.1)

matched2 <- df[unlist(psmatch2[c("index.treated", "index.control")]),]

table_caliper <- CreateTableOne(vars = xvars, strata = "treat", test = FALSE,
                                data = matched2)
print(table_caliper, smd = TRUE)
                      Stratified by treat
                       0                 1                 SMD   
  n                        112               112                 
  age (mean (SD))        26.22 (10.82)     26.17 (7.18)     0.006
  educ (mean (SD))       10.53 (2.68)      10.31 (2.33)     0.085
  black (mean (SD))       0.72 (0.45)       0.74 (0.44)     0.040
  hispan (mean (SD))      0.10 (0.30)       0.10 (0.30)    <0.001
  married (mean (SD))     0.23 (0.42)       0.24 (0.43)     0.021
  nodegree (mean (SD))    0.61 (0.49)       0.65 (0.48)     0.092
  re74 (mean (SD))     2578.90 (4484.64) 2156.96 (5627.20)  0.083
  re75 (mean (SD))     1851.00 (2756.63) 1063.68 (2619.06)  0.293

From the SMD values, setting the caliper has achieved better balance, at the cost of a smaller sample size: more unmatched observations are dropped.

Now, outcome analysis after matching using caliper

y_trt <- matched2$re78[matched2$treat == 1]
y_ctrl <- matched2$re78[matched2$treat == 0]

t.test(y_trt, y_ctrl, paired = TRUE)

    Paired t-test

data:  y_trt and y_ctrl
t = 1.3221, df = 111, p-value = 0.1889
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -550.5773 2757.9256
sample estimates:
mean of the differences 
               1103.674 

The mean difference of 1104 (95% CI: -551 to 2758) is our final causal estimate. It tells us how much the training program increased average post-intervention earnings among participants. However, the 95% CI includes 0, so we cannot conclude that the training program had a causal effect on post-intervention earnings.

Propensity score matching for binary outcome data

For binary outcomes, the analysis is similar, but the effect size measures are the causal risk difference, causal risk ratio, causal odds ratio, etc. We can take a look at the Right Heart Catheterization (RHC) data, which is freely available.
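In the potential-outcomes notation used earlier, the corresponding estimands among the treated can be written, for example, as:

\[ RD = E[Y(1) - Y(0) \mid T = 1], \qquad RR = \frac{E[Y(1) \mid T = 1]}{E[Y(0) \mid T = 1]} \]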

The treatment variable is swang1, which indicates whether the patient received right heart catheterization (RHC) or not.

The outcome is death, indicating whether the patient died during follow-up.

In addition, the dataset contains several baseline characteristics measured at or shortly before the decision to provide RHC treatment, for example age, sex, mean blood pressure (meanbp1), the APS score (aps1), and the primary disease category (cat1).

These variables are strong candidates for confounders, because they are likely to influence both the decision to use RHC and the risk of death. The goal of propensity score matching is to create treated and untreated groups that are comparable with respect to these measured confounders, and then estimate the causal effect of RHC on mortality within this matched sample.

We assume that the decision to use RHC is not random; it is influenced by characteristics of the patient such as age, sex, underlying disease category, and severity at baseline (for example the APS score and mean blood pressure). The same characteristics also affect the risk of death (the outcome), so they act as confounders. The full dataset contains many more potential confounders; for this tutorial we only look at a handful of them.

Load the dataset and do a quick cleaning

# Load dataset
library(dplyr)
load(url("https://hbiostat.org/data/repo/rhc.sav"))
head(rhc, 3)
               cat1          cat2  ca sadmdte dschdte dthdte lstctdte
1              COPD          <NA> Yes   11142   11151     NA    11382
2     MOSF w/Sepsis          <NA>  No   11799   11844  11844    11844
3 MOSF w/Malignancy MOSF w/Sepsis Yes   12083   12143     NA    12400
  death cardiohx chfhx dementhx psychhx chrpulhx renalhx liverhx
1    No        0     0        0       0        1       0       0
2   Yes        1     1        0       0        0       0       0
3    No        0     0        0       0        0       0       0
  gibledhx malighx immunhx transhx amihx      age    sex      edu
1        0       1       0       0     0 70.25098   Male 12.00000
2        0       0       1       1     0 78.17896 Female 12.00000
3        0       1       1       0     0 46.09198 Female 14.06992
   surv2md1 das2d3pc t3d30 dth30 aps1 scoma1 meanbp1       wblc1 hrt1
1 0.6409912 23.50000    30    No   46      0      41 22.09765620  124
2 0.7549996 14.75195    30    No   50      0      63 28.89843750  137
3 0.3169999 18.13672    30    No   82      0      57  0.04999542  130
  resp1    temp1    pafi1     alb1    hema1     bili1     crea1 sod1
1    10 38.69531  68.0000 3.500000 58.00000 1.0097656 1.1999512  145
2    38 38.89844 218.3125 2.599609 32.50000 0.6999512 0.5999756  137
3    40 36.39844 275.5000 3.500000 21.09766 1.0097656 2.5996094  146
      pot1 paco21      ph1 swang1  wtkilo1 dnr1           ninsclas
1 4.000000     40 7.359375 No RHC 64.69995   No           Medicare
2 3.299805     34 7.329102    RHC 45.69998   No Private & Medicare
3 2.899902     16 7.359375    RHC  0.00000   No            Private
  resp card neuro gastr renal meta hema seps trauma ortho adld3p
1  Yes  Yes    No    No    No   No   No   No     No    No      0
2   No   No    No    No    No   No   No  Yes     No    No     NA
3   No  Yes    No    No    No   No   No   No     No    No     NA
  urin1  race     income  ptid
1    NA white Under $11k 00005
2  1437 white Under $11k 00007
3   599 white   $25-$50k 00009
# Prepare binary variables
mydata <- rhc %>% 
  mutate(
    ARF     = as.numeric(cat1 == "ARF"),
    CHF     = as.numeric(cat1 == "CHF"),
    Cirr    = as.numeric(cat1 == "Cirrhosis"),
    colcan  = as.numeric(cat1 == "Colon Cancer"),
    Coma    = as.numeric(cat1 == "Coma"),
    lungcan = as.numeric(cat1 == "Lung Cancer"),
    MOSF    = as.numeric(cat1 == "MOSF w/Malignancy"),
    sepsis  = as.numeric(cat1 == "MOSF w/Sepsis"),
    female  = as.numeric(sex == "Female"),
    died    = as.numeric(death == "Yes"),
    treatment = as.numeric(swang1 == "RHC")
  ) %>% 
  select(
    ARF, CHF, Cirr, colcan, Coma, lungcan, MOSF, sepsis,
    age, female, meanbp1, aps1, treatment, died
  )

Fit the propensity score model

psmodel <- glm(
  treatment ~ age + female + meanbp1 +
    ARF + CHF + Cirr + colcan + Coma + lungcan + MOSF + sepsis,
  data = mydata,
  family = binomial()
)

mydata$ps <- predict(psmodel, type = "response")
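As in the lalonde example, it is worth checking the overlap of the propensity scores before matching (a sketch reusing the ggplot2 code from above):

ggplot(mydata, aes(x = ps, fill = factor(treatment))) +
  geom_histogram(alpha = 0.4, position = "identity", bins = 30) +
  labs(title = "Propensity score distribution before matching (RHC data)",
       fill = "RHC", x = "Propensity score")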

Propensity score matching (1:1 nearest neighbour)

m.out <- matchit(
  treatment ~ age + female + meanbp1 +
    ARF + CHF + Cirr + colcan + Coma + lungcan + MOSF + sepsis,
  data   = mydata,
  method = "nearest",
  distance = mydata$ps,
  ratio  = 1
)

matched_data <- match.data(m.out)

Assessing Covariate Balance (SMDs)

confounders <- c(
  "age", "female", "meanbp1", "aps1",
  "ARF", "CHF", "Cirr", "colcan",
  "Coma", "lungcan", "MOSF", "sepsis"
)

# Before matching
balance_before <- CreateTableOne(
  vars = confounders,
  strata = "treatment",
  data = mydata,
  test = FALSE
)

# After matching
balance_after <- CreateTableOne(
  vars = confounders,
  strata = "treatment",
  data = matched_data,
  test = FALSE
)
cat("Standardized Mean Differences Before Matching:\n")
Standardized Mean Differences Before Matching:
print(balance_before, smd = TRUE)
                     Stratified by treatment
                      0           1           SMD   
  n                   3551        2184              
  female (mean (SD))  0.46 (0.50) 0.41 (0.49)  0.093
  ARF (mean (SD))     0.45 (0.50) 0.42 (0.49)  0.059
  CHF (mean (SD))     0.07 (0.25) 0.10 (0.29)  0.095
  Cirr (mean (SD))    0.05 (0.22) 0.02 (0.15)  0.145
  colcan (mean (SD))  0.00 (0.04) 0.00 (0.02)  0.038
  Coma (mean (SD))    0.10 (0.29) 0.04 (0.20)  0.207
  lungcan (mean (SD)) 0.01 (0.10) 0.00 (0.05)  0.095
  MOSF (mean (SD))    0.07 (0.25) 0.07 (0.26)  0.018
  sepsis (mean (SD))  0.15 (0.36) 0.32 (0.47)  0.415
cat("\n\nStandardized Mean Differences After Matching:\n")


Standardized Mean Differences After Matching:
print(balance_after, smd = TRUE)
                     Stratified by treatment
                      0             1             SMD   
  n                    2184          2184               
  age (mean (SD))     60.90 (17.90) 60.75 (15.63)  0.009
  female (mean (SD))   0.44 (0.50)   0.41 (0.49)   0.044
  meanbp1 (mean (SD)) 70.64 (32.88) 68.20 (34.24)  0.073
  ARF (mean (SD))      0.50 (0.50)   0.42 (0.49)   0.166
  CHF (mean (SD))      0.09 (0.29)   0.10 (0.29)   0.017
  Cirr (mean (SD))     0.03 (0.16)   0.02 (0.15)   0.021
  colcan (mean (SD))   0.00 (0.04)   0.00 (0.02)   0.030
  Coma (mean (SD))     0.04 (0.20)   0.04 (0.20)   0.018
  lungcan (mean (SD))  0.00 (0.04)   0.00 (0.05)   0.010
  MOSF (mean (SD))     0.09 (0.28)   0.07 (0.26)   0.051
  sepsis (mean (SD))   0.23 (0.42)   0.32 (0.47)   0.200

Check covariate balance graphically: Love plot

bal.tab(m.out, un = TRUE)   # print balance summary
Balance Measures
             Type Diff.Un Diff.Adj
distance Distance  0.7374   0.1677
age       Contin. -0.0647  -0.0096
female     Binary -0.0462  -0.0220
meanbp1   Contin. -0.4869  -0.0713
ARF        Binary -0.0290  -0.0824
CHF        Binary  0.0261   0.0050
Cirr       Binary -0.0268  -0.0032
colcan     Binary -0.0012  -0.0009
Coma       Binary -0.0525   0.0037
lungcan    Binary -0.0073   0.0005
MOSF       Binary  0.0045  -0.0137
sepsis     Binary  0.1721   0.0888

Sample sizes
          Control Treated
All          3551    2184
Matched      2184    2184
Unmatched    1367       0
love.plot(
  m.out,
  stats      = "mean.diffs",
  threshold  = 0.1,
  abs        = TRUE,
  var.order  = "unadjusted"
)

As you can see from the love plot and the before/after SMD values, most standardized mean differences fall below 0.1 after matching (only ARF and sepsis remain slightly above), so the matching has worked reasonably well. If you want a more conservative match, you can tighten it by setting a caliper.

Causal Effect Estimation (After Matching)

We now use the matched dataset to estimate the causal risk difference and the causal risk ratio.

# Number in each matched group
n_treated  <- sum(matched_data$treatment == 1)
n_control  <- sum(matched_data$treatment == 0)

# Calculate risks
risk_treated  <- mean(matched_data$died[matched_data$treatment == 1])
risk_control  <- mean(matched_data$died[matched_data$treatment == 0])

# Standard errors for risks: we need this for 95%CI
se_treated  <- sqrt(risk_treated  * (1 - risk_treated)  / n_treated)
se_control  <- sqrt(risk_control  * (1 - risk_control)  / n_control)

# Risk difference and CI
risk_difference <- risk_treated - risk_control
se_rd <- sqrt(se_treated^2 + se_control^2)

rd_lower <- risk_difference - 1.96 * se_rd
rd_upper <- risk_difference + 1.96 * se_rd

# Risk ratio and CI (log scale)
risk_ratio <- risk_treated / risk_control
se_log_rr <- sqrt((1 - risk_treated)  / (risk_treated  * n_treated) +
                  (1 - risk_control) / (risk_control * n_control))

rr_lower <- exp(log(risk_ratio) - 1.96 * se_log_rr)
rr_upper <- exp(log(risk_ratio) + 1.96 * se_log_rr)

# Print results
risk_treated
[1] 0.6804029
risk_control
[1] 0.6341575
risk_difference; c(rd_lower, rd_upper)
[1] 0.04624542
[1] 0.01812812 0.07436272
risk_ratio;      c(rr_lower, rr_upper)
[1] 1.072924
[1] 1.027862 1.119961

Therefore, the causal risk difference was 0.046 (95% CI: 0.018 to 0.074); it does not cross the null value (0), indicating that the harmful effect is statistically significant. The causal risk ratio was 1.07 (95% CI: 1.03 to 1.12), suggesting that patients who received RHC had a 7 percent higher risk of death compared with similar patients who did not receive RHC.
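The causal odds ratio mentioned earlier can be computed from the same matched data; here is a small sketch (an addition, not part of the original output):

# 2x2 counts of death by treatment in the matched sample
tab <- table(matched_data$treatment, matched_data$died)
n11 <- tab["1", "1"]; n10 <- tab["1", "0"]   # treated: died / survived
n01 <- tab["0", "1"]; n00 <- tab["0", "0"]   # control: died / survived

odds_ratio <- (n11 / n10) / (n01 / n00)
se_log_or  <- sqrt(1/n11 + 1/n10 + 1/n01 + 1/n00)
or_ci      <- exp(log(odds_ratio) + c(-1.96, 1.96) * se_log_or)

odds_ratio
or_ci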

Conclusion

In summary, we have seen that non-random treatment assignment in observational studies creates confounding, and that the propensity score summarizes multiple confounders into a single scalar that can be used to build a matched dataset. The matched dataset can be much smaller than the original unmatched data: matching discards unmatched subjects (often many), leading to reduced precision, weaker generalizability, and possible loss of rare treatment subgroups.

In contrast to PSM, Inverse Probability of Treatment Weighting (IPTW) is an attractive alternative that avoids discarding subjects. Instead of removing unmatched individuals, IPTW re-weights each observation by the inverse of the probability of the treatment it actually received: 1/(propensity score) for treated subjects and 1/(1 - propensity score) for control subjects. That will be my next post.
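As a preview, the weights described above take only one line of code; a minimal sketch using the propensity scores already estimated for the lalonde data:

# IPTW weights: 1/ps for treated subjects, 1/(1 - ps) for controls
df$iptw <- ifelse(df$treat == 1, 1 / df$pscore, 1 / (1 - df$pscore))
summary(df$iptw)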

In the meantime, remember that after matching, the difference in outcomes in the matched sample can be interpreted as a causal effect for the treated individuals (the ATT), provided the identification assumptions hold.

If you have enjoyed reading this, consider subscribing for upcoming posts.
