Summarizing Categorical Data

STAT 20: Introduction to Probability and Statistics

Agenda

  • Announcements
  • Reading Questions
  • Break
  • Worksheet: Summarizing Categorical Data
  • Appendix: Practice

Announcements

  • My office hours will be Fridays from 9-11am in Evans Hall 340
  • Group tutoring will be Thursdays from 5-7pm in Evans Hall 340

Reading Questions

  • Please put your laptops under your desk and your phones away.
  • Write your name, ID, and bubble in Version “A” on your answer sheet.
  • You may work only with those at your table!

Which of the following is TRUE regarding a contingency table?

  • A. It has the levels of one categorical variable along the top of the columns and the levels of a second categorical variable down the left side of the rows.

  • B. It has the units of observation down the rows and the names of the variables recorded on each observation along the top of the columns.

  • C. It has the levels of one numerical variable along the top of the columns and the levels of a second numerical variable down the left side of the rows.

  • D. None of the above are true.

00:35

What proportion of those students in the marching band were juniors?

  • A. 2/60
  • B. 2/5
  • C. 2/24
  • D. 24/60
01:00

Which group is the largest among these voters?

  • A. Independents with no degree
  • B. Democrats with no degree
  • C. Republicans with a degree
  • D. We cannot tell with this plot.
00:30

You are running code that uses an external library. How often do you need to load the library?

  • A. Every time you run the piece of code in question.
  • B. Once per .qmd document.
  • C. Once every time you start an R session.
  • D. None of these answers are correct.
00:30

How do you add a new layer to a ggplot visualization?

  • A. With a comma (,).
  • B. With a plus sign (+).
  • C. With a semicolon (;).
  • D. Just start a new line.
00:30

Break

05:00

Now we can see which group is largest!

Worksheet: Summarizing Categorical Data

20:00

End of Lecture

Appendix: Practice!

Concept Questions

  • The next few subquestions are based on the same table.

The table below displays data from a survey on a class of students.

What proportion of the class was in the marching band?

00:30

What proportion were sophomores not in the marching band?

00:30

What were the dimensions of the raw data from which this table was constructed?

00:30

How would you characterize the association between these two variables?

00:30
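
For practice after class, here is a minimal sketch of how the quantities in the questions above could be computed in R. The data frame `students` and its counts are made up for illustration; they are not the survey table from the slide, and the variable names `year` and `band` are assumptions.

```r
# Hypothetical raw data: one row per student, two categorical variables.
# These values are invented for illustration, not taken from the slide's table.
students <- data.frame(
  year = c(rep("sophomore", 6), rep("junior", 4)),
  band = c(rep("yes", 2), rep("no", 4), rep("yes", 3), rep("no", 1))
)

# Dimensions of the raw data: rows are students, columns are variables.
dim(students)

# Contingency table of counts: levels of year down the rows,
# levels of band across the columns.
counts <- table(students$year, students$band)

# Joint proportions: each cell divided by the grand total.
joint <- prop.table(counts)

# Conditional proportions within each column, for example the share of
# band members who are juniors.
by_band <- prop.table(counts, margin = 2)

# Marginal proportion of all students who are in the band.
band_share <- sum(counts[, "yes"]) / sum(counts)
```

Note the difference between `joint` and `by_band`: the first divides by the whole class, the second only by the students in each band group, which is the distinction the proportion questions are probing.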

CQ 3

What does this plot show? Important note: it looks like a leading “1” was cropped from the numbers along the y axis.

01:00