Multiple Linear Regression

STAT 20: Introduction to Probability and Statistics

Agenda

  • Announcements
  • Concept Questions
  • Break
  • Appendix: more practice
  • Worksheet: Multiple Linear Regression

Announcements

  • Lab 2: Flights due tomorrow at 11:59pm.
    • Part 1: Submit like your portfolios - scan and upload
    • Part 2: Render to PDF, download, and upload - make sure to assign pages
  • No reading questions for the next six lectures leading up to the quiz (traditional-style lecture).

Concept Questions

What are \(b_0, b_1, \ldots, b_p\) in the multiple linear regression formula called?

  • A: Terms
  • B: Coefficients
  • C: Variables
  • D: Predictors
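
For reference, the multiple linear regression formula this question refers to is usually written as shown below. The symbols \(\hat{y}\) and \(x_1, \ldots, x_p\) are assumed notation here, since the slide itself only names \(b_0, \ldots, b_p\).

\[ \hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_p x_p \]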

What values can the variable \(geowest\) take in the formula below?

\[ \widehat{price} = -15.97 + 2.87 \times food - 1.45 \times geowest \]

  • A: any value on the real number line
  • B: either 0 or 1
  • C: "east" or "west"

What name is given to a level of a categorical variable that does not get its own indicator in a linear model?

  • A: Indicator level
  • B: Coefficient
  • C: Reference level
  • D: Primary level

If you include a numerical variable as an input variable to a linear model, how many terms for it will appear in the model?

  • A: 1
  • B: 2
  • C: 3
  • D: 4

If you include a categorical variable with 3 levels as an input variable in a linear model, how many terms for it will appear in the model?

  • A: 1
  • B: 2
  • C: 3
  • D: 4
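
One way to check an answer to this kind of question is to look at the design matrix R builds for a categorical input. The sketch below uses a made-up three-level factor called `region` (a hypothetical variable, not part of the lecture data) and assumes R's default coding for factors.

```r
# Hypothetical three-level categorical variable (not from the lecture data).
region <- factor(c("north", "south", "west", "north", "west", "south"))

# model.matrix() builds the design matrix that lm() would use for this input.
design <- model.matrix(~ region)

# Column names: the intercept plus one column per term created for `region`.
term_columns <- colnames(design)
print(term_columns)
```

Counting the columns other than `(Intercept)` gives the number of terms contributed by the categorical variable.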

True or false: if I fit a model of bill_length_mm based on bill_depth_mm, and then a second model that also includes body_mass_g as an explanatory variable, the coefficient of bill_depth_mm will always be the same in both models.

  • True
  • False
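
The claim can also be checked empirically by fitting both models and comparing the coefficient. This is a minimal sketch that assumes the `penguins` data frame comes from the palmerpenguins package, where the body mass column is named `body_mass_g`; the object names `fit_one`, `fit_two`, and `depth_coefs` are chosen here for illustration.

```r
# Assumption: `penguins` is the data frame from the palmerpenguins package.
library(palmerpenguins)

# Model with bill depth as the only explanatory variable.
fit_one <- lm(bill_length_mm ~ bill_depth_mm, data = penguins)

# Model that adds body mass as a second explanatory variable.
fit_two <- lm(bill_length_mm ~ bill_depth_mm + body_mass_g, data = penguins)

# Pull the bill_depth_mm coefficient out of each fit and show them side by side.
depth_coefs <- c(
  one_variable  = unname(coef(fit_one)["bill_depth_mm"]),
  two_variables = unname(coef(fit_two)["bill_depth_mm"])
)
print(depth_coefs)
```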

Break

05:00

Appendix - More practice!

Question 1

m1 <- lm(bill_depth_mm ~ bill_length_mm, data = penguins)
m2 <- lm(bill_depth_mm ~ bill_length_mm + body_mass_g + species, 
         data = penguins)

How many more coefficients does the second model have than the first?
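
After working out the answer by hand, it can be verified in R by counting the fitted coefficients of each model. The sketch below assumes `penguins` is loaded from the palmerpenguins package; `n_extra` is a name introduced here for illustration.

```r
# Assumption: `penguins` is the data frame from the palmerpenguins package.
library(palmerpenguins)

# The two models from Question 1.
m1 <- lm(bill_depth_mm ~ bill_length_mm, data = penguins)
m2 <- lm(bill_depth_mm ~ bill_length_mm + body_mass_g + species,
         data = penguins)

# coef() returns one entry per fitted coefficient, including the intercept.
n_extra <- length(coef(m2)) - length(coef(m1))
print(n_extra)
```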

Questions 2-4

Consider the following multiple linear regression model, which will be the subject of the next three review questions.

Question 2

m2

Call:
lm(formula = bill_depth_mm ~ bill_length_mm + body_mass_g + species, 
    data = penguins)

Coefficients:
     (Intercept)    bill_length_mm       body_mass_g  speciesChinstrap  
        10.33083           0.09484           0.00117          -0.90748  
   speciesGentoo  
        -5.80117  

Which is the correct interpretation of the coefficient in front of bill length? Select all that apply.
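
It may help to write the printed coefficients out as a fitted equation before interpreting them. The form below simply copies the numbers from the R output above; \(speciesChinstrap\) and \(speciesGentoo\) are the term names R printed for the species variable.

\[ \widehat{bill\_depth\_mm} = 10.33083 + 0.09484 \times bill\_length\_mm + 0.00117 \times body\_mass\_g - 0.90748 \times speciesChinstrap - 5.80117 \times speciesGentoo \]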

Question 3

Which is the correct interpretation of the coefficient in front of Gentoo?

Question 4

How would this linear model best be visualized?

Worksheet: Multiple Linear Regression

30:00