Using Time to Measure Causal Effects

STAT 20: Introduction to Probability and Statistics

Concept Questions

Based on the plot, which of these analyses will give us a good estimate of the treatment effect?

  1. Pre/post comparison.
  2. Interrupted time series.
  3. Difference-in-differences.
  4. None of the above.

Based on the plot, which of these analyses will give us a good estimate of the treatment effect?

  1. Pre/post comparison.
  2. Interrupted time series.
  3. Difference-in-differences.
  4. None of the above.

Based on the plot, which of these analyses will give us a good estimate of the treatment effect?

  1. Pre/post comparison.
  2. Interrupted time series.
  3. Difference-in-differences.
  4. None of the above.

Based on the plot, which of these analyses will give us a good estimate of the treatment effect?

  1. Pre/post comparison.
  2. Interrupted time series.
  3. Difference-in-differences.
  4. None of the above.

Based on the plot, which of these analyses will give us a good estimate of the treatment effect?

  1. Pre/post comparison.
  2. Interrupted time series.
  3. Difference-in-differences.
  4. None of the above.

A statistician conducts a pre/post comparison and attempts to obtain a confidence interval for the treatment effect estimate using the bootstrap. Shown below are the original data (first) and one of the bootstrap samples (second).

Original Sample:

Subject  Response  Time_Period
Jimmy    1.0       Pre
Jimmy    1.5       Post
Sarita   4.0       Pre
Sarita   4.2       Post
Min      1.8       Pre
Min      2.3       Post

Bootstrap Sample:

Subject  Response  Time_Period
Jimmy    1.5       Post
Jimmy    1.5       Post
Sarita   4.0       Pre
Sarita   4.2       Post
Sarita   4.0       Pre
Min      2.3       Post

What is the problem with this way of using the bootstrap?

A. The bootstrap sample does not contain the right number of observations.

B. Some of the observations in the bootstrap sample are exact copies of each other.

C. Unique subjects in the bootstrap sample do not have one “pre” and one “post” observation each.

D. There is no problem; this is a valid use of the bootstrap.

library(tidyverse)
library(infer)

# Toy data: one Pre and one Post response for each of three subjects
toy_example <- data.frame('Subject' = c(rep('Jimmy',2),
                                        rep('Sarita',2),
                                        rep('Min',2)),
                          'Response' = c(1.0,1.5,4.0,4.2,1.8,2.3),
                          'Time_Period' = rep(c('Pre','Post'),3))

Incorrect:

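# Treats the six rows as independent observations,
# so each bootstrap resample draws rows, not subjects
toy_example |>
  specify(response = Response,
          explanatory = Time_Period) |>
  generate(reps = 500,
           type = 'bootstrap') |>
  calculate(stat = 'diff in means',
            order = c('Post','Pre')) |>
  visualize()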
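
To see why this version is labeled incorrect, it can help to draw a single row-level resample by hand. The following is a minimal base R sketch, not part of the original slides; the seed and the names `row_boot` and `pair_counts` are illustrative assumptions. It rebuilds the toy data, resamples the six rows with replacement, and counts the Pre and Post rows that each subject ends up with.

```r
# Rebuild the toy data so this sketch runs on its own
toy_example <- data.frame('Subject' = rep(c('Jimmy', 'Sarita', 'Min'), each = 2),
                          'Response' = c(1.0, 1.5, 4.0, 4.2, 1.8, 2.3),
                          'Time_Period' = rep(c('Pre', 'Post'), 3))

set.seed(20)  # assumption: arbitrary seed, used only for reproducibility

# Resample individual rows with replacement (a row-level bootstrap)
row_boot <- toy_example[sample(nrow(toy_example), replace = TRUE), ]

# Count Pre and Post rows per subject, keeping subjects that were never drawn
pair_counts <- table(factor(row_boot$Subject,
                            levels = c('Jimmy', 'Sarita', 'Min')),
                     factor(row_boot$Time_Period,
                            levels = c('Pre', 'Post')))
pair_counts
```

If the pairing were preserved, every cell of `pair_counts` would equal 1. A row-level resample rarely achieves that: subjects typically end up with duplicated Pre or Post rows, or with one of the two missing, as in the bootstrap sample shown earlier.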

Correct:

# Reduce each subject to a single Post - Pre difference,
# then bootstrap the subjects (one row per subject)
toy_example |>
  pivot_wider(names_from = Time_Period,
              values_from = Response) |>
  mutate(diff = Post - Pre) |>
  specify(response = diff) |>
  generate(reps = 500,
           type = 'bootstrap') |>
  calculate(stat = 'mean') |>
  visualize(bins = 4) + xlim(-2.5, 2.5)
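
The question above asks for a confidence interval, while the pipeline stops at a plot of the bootstrap distribution. One possible way to finish the job is sketched here, assuming a percentile interval at the 95% level is wanted; the level, the seed, and the names `boot_diffs` and `ci` are assumptions for illustration and do not come from the original slides. It uses `get_confidence_interval()` from the infer package.

```r
library(tidyverse)
library(infer)

# Rebuild the toy data so this sketch runs on its own
toy_example <- data.frame('Subject' = rep(c('Jimmy', 'Sarita', 'Min'), each = 2),
                          'Response' = c(1.0, 1.5, 4.0, 4.2, 1.8, 2.3),
                          'Time_Period' = rep(c('Pre', 'Post'), 3))

set.seed(20)  # assumption: arbitrary seed, used only for reproducibility

# Bootstrap distribution of the mean within-subject difference
boot_diffs <- toy_example |>
  pivot_wider(names_from = Time_Period,
              values_from = Response) |>
  mutate(diff = Post - Pre) |>
  specify(response = diff) |>
  generate(reps = 500,
           type = 'bootstrap') |>
  calculate(stat = 'mean')

# Percentile interval: the middle 95% of the bootstrap means
ci <- boot_diffs |>
  get_confidence_interval(level = 0.95,
                          type = 'percentile')
ci
```

With only three subjects, the bootstrap means can take just a handful of distinct values, so the resulting interval is coarse. The toy data illustrate the mechanics, not a realistic analysis.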