STAT 20: Introduction to Probability and Statistics
Concept Questions
To study the impact of receiving permanent resident status on mental health, we compare answers to a psychiatric survey from people who entered and won the US green card lottery to answers from others who entered but did not win.
What kind of study is this?
A randomized trial.
A natural experiment.
An observational study that requires matching.
None of the above.
To study the impact of childhood trauma on later academic performance, we compare GRE scores for students who lost a close family member in an automobile accident before the age of 8 to GRE scores for students who did not lose a close family member before age 8.
What kind of study is this?
A randomized trial.
A natural experiment.
An observational study that requires matching.
None of the above.
To study the effectiveness of a blood pressure medication, we enroll 500 patients. We take the blood pressure of all patients before anyone receives medication. We assign the 200 patients with the highest blood pressure readings to get the medication, assigning the others to be controls.
What kind of study is this?
A randomized trial.
A natural experiment.
An observational study that requires matching.
None of the above.
The table below shows the first few rows of a dataset containing demographic information on California counties.
We want to use matching to determine whether a difference in median_edu has a causal effect on homeownership. Which county serves as the best counterfactual match for Fresno County?
Kern County
Alameda County
Contra Costa County
Shasta County
Del Norte County
| name                   | homeownership | median_edu   | metro | smoking_ban |
|------------------------|---------------|--------------|-------|-------------|
| Fresno County          | 55.0          | some_college | yes   | none        |
| Colusa County          | 64.4          | hs_diploma   | no    | none        |
| Del Norte County       | 60.9          | hs_diploma   | no    | none        |
| Alameda County         | 55.1          | some_college | yes   | none        |
| Contra Costa County    | 69.5          | some_college | yes   | partial     |
| Glenn County           | 67.5          | hs_diploma   | no    | none        |
| Shasta County          | 66.0          | some_college | yes   | none        |
| Kern County            | 61.4          | hs_diploma   | yes   | none        |
| San Luis Obispo County | 61.4          | some_college | yes   | none        |
In this table there are nine counties, five with some_college values for median_edu and four with hs_diploma values.
How many counties of each type will remain after we conduct optimal matching on metro and smoking_ban?
some_college: 4, hs_diploma: 4.
some_college: 5, hs_diploma: 4.
some_college: 2, hs_diploma: 2.
some_college: 2, hs_diploma: 4.
Can’t tell without more information.
Which R command correctly performs matching on covariates to measure the impact of median_edu on homeownership?
matchit(homeownership ~ median_edu, data = county, method = 'optimal', distance = 'euclidean')
matchit(median_edu ~ homeownership, data = county, method = 'optimal', distance = 'euclidean')
matchit(median_edu ~ metro + smoking_ban, data = county, method = 'optimal', distance = 'euclidean')
matchit(homeownership ~ median_edu + metro + smoking_ban, data = county, method = 'optimal', distance = 'euclidean')
Assuming that metro and smoking_ban are the only variables we have measured, name an unmeasured variable that could introduce confounding between median_edu and homeownership.