A logistic regression model was fit to predict the sex of a penguin ("male" or "female") from its body mass (grams).
Assuming the penguins dataset was not modified, will the model predict the probability that the penguin is male or the probability that the penguin is female?
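
One way to reason about this question: for a factor response, `glm()` with `family = "binomial"` treats the first factor level as "failure" and models the probability of the other level. The sketch below uses a small made-up factor as a stand-in for the `sex` column, and assumes that column keeps R's default alphabetical level order.

```r
# Toy stand-in for the sex column (made-up values, default level order).
sex <- factor(c("male", "female", "female", "male"))

# Levels are sorted alphabetically unless they are set explicitly.
levels(sex)

# With a factor response, binomial glm() treats the first level as "failure",
# so the fitted probabilities refer to the remaining level.
modeled_level <- levels(sex)[-1]
modeled_level
```

If the levels had been reordered (for example with `relevel()`), the modeled level would change accordingly.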
 (Intercept)  body_mass_g 
-5.162541644  0.001239819 

Which of the expressions given in the poll (math or code) will correctly calculate the predicted probability that a penguin that weighs 4000 g is a female? Select all that apply.
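
As a rough check on the poll options, the calculation can be sketched in base R. The variable names here are illustrative, and the sketch assumes the model's "success" level is "male", so the probability of female is the complement.

```r
# Coefficients copied from the output above.
b0 <- -5.162541644
b1 <- 0.001239819

# Linear predictor (log-odds) for a 4000 g penguin.
eta <- b0 + b1 * 4000

# The inverse logit gives the probability of the modeled ("success") level.
p_success <- exp(eta) / (1 + exp(eta))  # equivalent to plogis(eta)

# Assumption: the success level is "male", so P(female) is the complement.
p_female <- 1 - p_success
p_female
```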
What is the misclassification rate of this model?
# A tibble: 4 × 3
# Groups:   sex [2]
  sex    y_hat      n
  <fct>  <chr>  <int>
1 female female   109
2 female male      56
3 male   female    74
4 male   male      94
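
The misclassification rate is the share of penguins whose predicted sex differs from the observed sex. A minimal base R sketch, with the counts typed in by hand from the table (the data frame name `counts` is just for illustration):

```r
# Counts copied from the table above.
counts <- data.frame(
  sex   = c("female", "female", "male", "male"),
  y_hat = c("female", "male", "female", "male"),
  n     = c(109, 56, 74, 94)
)

# A row is misclassified when the prediction differs from the observed sex.
wrong <- counts$sex != counts$y_hat

# Misclassification rate = misclassified penguins / all penguins.
misclass_rate <- sum(counts$n[wrong]) / sum(counts$n)
misclass_rate
```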
   (Intercept)    body_mass_g bill_length_mm 
   -6.91208086     0.00101530     0.06112808 

Open up RStudio and fit the model here in the slides. What are the predicted sexes of these two penguins?
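
For checking an answer by hand, the coefficients can be plugged into the inverse logit directly. This is only a sketch: `predict_sex()` is a hypothetical helper, the 0.5 cutoff and the "male" success level are assumptions, and the measurements at the end are made up (they are not the two penguins from the slide).

```r
# Coefficients copied from the output above.
coefs <- c(intercept      = -6.91208086,
           body_mass_g    = 0.00101530,
           bill_length_mm = 0.06112808)

# Hypothetical helper: predicted sex at a given probability cutoff,
# assuming the model's "success" level is "male".
predict_sex <- function(body_mass_g, bill_length_mm, cutoff = 0.5) {
  eta <- coefs[["intercept"]] +
    coefs[["body_mass_g"]] * body_mass_g +
    coefs[["bill_length_mm"]] * bill_length_mm
  p_male <- plogis(eta)  # inverse logit of the linear predictor
  ifelse(p_male > cutoff, "male", "female")
}

# Made-up measurements, NOT the two penguins from the slide.
predict_sex(body_mass_g = c(4000, 5500), bill_length_mm = c(45, 50))
```

With a fitted model object, `predict(m, newdata, type = "response")` does the same arithmetic without retyping the coefficients.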
glm()!

m2 <- glm(sex ~ body_mass_g + bill_length_mm,
          data = train, family = "binomial")
p_hat <- predict(m2, test, type = "response")
test |>
  select(sex) |>
  mutate(p_hat = p_hat)

# A tibble: 70 × 2
   sex    p_hat
   <fct>  <dbl>
 1 female 0.345
 2 male   0.566
 3 female 0.259
 4 male   0.280
 5 male   0.365
 6 female 0.196
 7 male   0.428
 8 female 0.220
 9 male   0.559
10 male   0.279
# ℹ 60 more rows

m2 <- glm(sex ~ body_mass_g + bill_length_mm,
          data = train, family = "binomial")
test |>
  select(sex) |>
  mutate(p_hat = predict(m2, test, type = "response"),
         y_hat = ifelse(p_hat > .5, "male", "female"))

# A tibble: 70 × 3
   sex    p_hat y_hat 
   <fct>  <dbl> <chr> 
 1 female 0.345 female
 2 male   0.566 male  
 3 female 0.259 female
 4 male   0.280 female
 5 male   0.365 female
 6 female 0.196 female
 7 male   0.428 female
 8 female 0.220 female
 9 male   0.559 male  
10 male   0.279 female
# ℹ 60 more rows

test |>
  select(sex) |>
  mutate(p_hat = p_hat,
         y_hat = ifelse(p_hat > .5, "male", "female"),
         FP = sex == "female" & y_hat == "male",
         FN = sex == "male" & y_hat == "female")

# A tibble: 70 × 5
   sex    p_hat y_hat  FP    FN   
   <fct>  <dbl> <chr>  <lgl> <lgl>
 1 female 0.345 female FALSE FALSE
 2 male   0.566 male   FALSE FALSE
 3 female 0.259 female FALSE FALSE
 4 male   0.280 female FALSE TRUE 
 5 male   0.365 female FALSE TRUE 
 6 female 0.196 female FALSE FALSE
 7 male   0.428 female FALSE TRUE 
 8 female 0.220 female FALSE FALSE
 9 male   0.559 male   FALSE FALSE
10 male   0.279 female FALSE TRUE 
# ℹ 60 more rows
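
Once the FP and FN flags exist, they can be tallied, since logical columns sum to counts and average to rates. The sketch below uses base R and only the ten rows displayed above (the other 60 test rows are not reproduced here), so the numbers it yields describe those ten rows, not the full test set. The data frame name `shown` is illustrative.

```r
# First ten rows shown above; the remaining 60 test rows are not reproduced.
shown <- data.frame(
  sex   = c("female", "male", "female", "male", "male",
            "female", "male", "female", "male", "male"),
  p_hat = c(0.345, 0.566, 0.259, 0.280, 0.365,
            0.196, 0.428, 0.220, 0.559, 0.279)
)

# Same rules as in the slide code: 0.5 cutoff, then flag each error type.
shown$y_hat <- ifelse(shown$p_hat > 0.5, "male", "female")
shown$FP <- shown$sex == "female" & shown$y_hat == "male"
shown$FN <- shown$sex == "male" & shown$y_hat == "female"

# Logical columns sum to counts; the mean of "any error" is the error rate.
n_fp <- sum(shown$FP)
n_fn <- sum(shown$FN)
misclass_rate <- mean(shown$FP | shown$FN)
c(FP = n_fp, FN = n_fn, rate = misclass_rate)
```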