In groups of 3, take turns introducing yourselves to one another by providing the info listed on the handout (your name, hometown, etc.).
Each person should finish with a handout filled in with info about their groupmates. Make sure you save this for later this week!
What’s going on here?
Does this image change which claims are more or less likely?
Upvote and downvote explanations at pollev.com.
[Diagram: "Understand the World" and "Data"]
We can call the process of:
This lifecycle involves constructing and critiquing claims made using data, which is the main goal of our course!
To learn to critique and construct
claims made using data.
Summary: A numerical, graphical, or verbal description of an aspect of data that is on hand.
Example
Using data from the Stat 20 class survey, the proportion of respondents to the survey who reported having no experience writing computer code is 70%.
Generalization: A numerical, graphical, or verbal description of a broader set of units than those on which data has been recorded.
Example
Using data from the Stat 20 class survey, the proportion of Berkeley students who have no experience writing computer code is 70%.
Causal claim: A claim that changing the value of one variable will influence the value of another variable.
Example
Data from a randomized controlled experiment shows that taking a new antibiotic eliminates more than 99% of bacterial infections.
Prediction: A guess about the value of an unknown variable, based on other known variables.
Example
Based on reading the news and the price of Uber’s stock today, I predict that Uber’s stock price will go up 1.2% tomorrow.
We will now re-examine a few pathways in the data science lifecycle:
Now, visit the PollEverywhere site provided by your instructor to answer the following questions.
Is the incidence of COVID on campus going up or down?
Will this question be answered by a summary, a prediction, a generalization, or a causal claim?
Also discuss: what type of data can help answer this question? Consider:
One source of data:
“The following dashboard provides information on COVID-19 testing performed at University Health Services or through the PCR Home Test Vending Machines on campus. It does not capture self-reported positive tests. It provides a look at new cases and trends, at a glance.”
Formulate one claim that is supported by this data.
All of the materials and links for the course can be found at:
Take 4 minutes to read through the syllabus and jot down at least one question that you have.
Practice by asking/answering a question on the “Syllabus Discussion” thread on Ed via the link at the top right of <https://berkeley-stat20.github.io/fall-2024/>.
Visit stat20.datahub.berkeley.edu! We will now:
Console
Environment
Editor
File Directory
Now we are going to switch over to RStudio to understand these 4 components a bit better.
Console: Where the live R session lives. Type commands at the prompt (>) and press enter/return to run them. Found in the lower left pane.
Environment: The space that keeps track of all of the data and objects that you have created or loaded and have access to. Found in the upper right pane.
Editor: Used to compose and edit text (.qmd files) and R code (.R files). Found in the upper left pane.
File Directory: Used to navigate between the files and folders in your RStudio account. You can move, copy, rename, and delete files here. Found in the lower right pane.
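As a small illustration of how the Console and Environment relate, here is a minimal sketch you could type at the Console prompt. The object name `example_total` is hypothetical and chosen only for this example.

```r
# Assign the result of a calculation to a name (the name is hypothetical).
# After this line runs, the object should be listed in the Environment pane.
example_total <- 2 + 3

# Typing the name by itself prints the stored value in the Console.
example_total

# ls() lists the names of the objects currently in the environment.
ls()
```

Running `rm(example_total)` afterwards would remove the object from the Environment again.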
R allows all of the standard arithmetic operations.
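A short sketch of the basic arithmetic operators, which can be typed one line at a time at the Console prompt. The numbers are arbitrary and used only for illustration.

```r
7 + 2        # addition
7 - 2        # subtraction
7 * 2        # multiplication
7 / 2        # division
7^2          # exponentiation
(7 + 2) * 3  # parentheses control the order of operations
```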
What is three times one point two raised to the quantity thirteen divided by six?
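One way to translate this sentence into R is sketched below. It assumes the exponent applies only to 1.2, which matches R's usual order of operations (exponentiation is evaluated before multiplication). The name `result` is hypothetical.

```r
# "the quantity thirteen divided by six" needs parentheses so that the
# whole fraction is used as the exponent.
result <- 3 * 1.2^(13 / 6)
result

# Without the parentheses, R would compute (1.2^13) / 6 instead,
# which is a different number.
3 * 1.2^13 / 6
```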