Probability Foundations

STAT 20: Introduction to Probability and Statistics

Probability of two events…

xkcd comic showing two people discussing what it means to have a 50-50 chance

https://imgs.xkcd.com/comics/prediction.png

Maybe have a bit of discussion here about situations in which one of two things can happen, and ask what they think about the chance of each, e.g. a coin toss, an even vs. odd number on a die roll, winning the lottery vs. not winning, getting an A in Stat 20 vs. not, etc.

Concept review: Rules

Rules of probability

Let Ω be the outcome space, and let P(A) denote the probability of the event A. Then we have:

  1. P(A) ≥ 0
  2. P(Ω) = 1
  3. If A and B are mutually exclusive (A∩B = {}), then P(A∪B) = P(A) + P(B)

Concepts to review (KEEP THIS BRIEF, OR INCORPORATE INTO CQ’s):

  • The first two rules of probability

  • Unions and intersections

  • Mutually exclusive events and the addition rule

  • Good idea to draw Venn diagrams here

  • Use rule 3 to write down the complement rule, and show what A^C means
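One way to write down the complement rule from rule 3, as a sketch: A and A^C are mutually exclusive and together make up Ω, so rules 2 and 3 give the following.

```latex
\begin{align*}
A \cup A^{C} &= \Omega, \qquad A \cap A^{C} = \{\} \\
P(A) + P(A^{C}) &= P(A \cup A^{C}) = P(\Omega) = 1 \\
P(A^{C}) &= 1 - P(A)
\end{align*}
```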

Probability refresher

05:00

Give them five minutes; you can use Kahoot music.

Concept Question 1

The Linda Problem

01:00

The Linda problem is from a very famous experiment conducted by Daniel Kahneman and Amos Tversky in 1983 (The version below is from the book Thinking, Fast and Slow by Kahneman, page 156):

Linda is thirty-one years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in antinuclear demonstrations.

Which alternative is more probable?

  1. Linda is a bank teller.

  2. Linda is a bank teller and is active in the feminist movement.

Correct answer: option 1. Depending on the responses, you can discuss how, even though option 2 is contained in option 1 and therefore cannot have a higher probability, an overwhelming majority of Kahneman and Tversky's respondents ranked option 2 as more likely: “About 85% to 90% of undergraduates at several major universities chose the second option, contrary to logic”. Talk about why this is so. Probability can be tricky and counter-intuitive. If they do well, congratulate them and say that they are among the rare people who understand that P(A and B) can never be larger than P(A).

Kahneman, Daniel. Thinking, Fast and Slow (p. 158). Farrar, Straus and Giroux.
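A short derivation of why the second option cannot be more probable, as a sketch using only rules 1 and 3. The labels are assumptions made here for illustration: A is "Linda is a bank teller" and B is "Linda is active in the feminist movement".

```latex
\begin{align*}
A &= (A \cap B) \cup (A \cap B^{C}), \qquad (A \cap B) \cap (A \cap B^{C}) = \{\} \\
P(A) &= P(A \cap B) + P(A \cap B^{C}) && \text{(rule 3)} \\
P(A) &\ge P(A \cap B) && \text{(rule 1: } P(A \cap B^{C}) \ge 0\text{)}
\end{align*}
```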

Coin tosses: 10 times

We can simulate coin tosses and see if the simulations justify our intuitive understanding of what happens when we toss a fair coin. Go over code and review set.seed() and sample(). Maybe change arguments and see what happens.

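library(dplyr)  # provides group_by() and summarise()

set.seed(12345)  # fix the seed so the simulation is reproducible

coin <- c("Heads", "Tails")
# draw 10 tosses with replacement, so each toss can land on either face
tosses <- sample(coin, 10, replace = TRUE)
data.frame(tosses) |>
  group_by(tosses) |>
  summarise(n = n())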
# A tibble: 2 × 2
  tosses     n
  <chr>  <int>
1 Heads      3
2 Tails      7

Coin tosses: 50 times

Comment on the fact that it is not a 50-50 split.

set.seed(12345)

tosses <- sample(coin, 50, replace = TRUE)
data.frame(tosses) |>
  group_by(tosses) |> 
  summarise(n = n())
# A tibble: 2 × 2
  tosses     n
  <chr>  <int>
1 Heads     15
2 Tails     35

Coin tosses: 500 times

Maybe tossing five hundred times will improve the split:

set.seed(12345)

tosses <- sample(coin, 500, replace = TRUE)
data.frame(tosses) |>
  group_by(tosses) |> 
  summarise(n = n())
# A tibble: 2 × 2
  tosses     n
  <chr>  <int>
1 Heads    251
2 Tails    249

We see that as the number of tosses increases, the split of heads and tails begins to look closer to 50-50.

Looking at proportions:

Here is a plot of the proportion of tosses that land heads when we toss a coin n times, where n varies from 1 to 1000.

Be sure to explain what they are looking at. At the beginning, for a small number of tosses, the proportion of times that the coin lands heads is all over the place, but it eventually settles down to around 0.5. This agrees with our intuition that if we toss a fair coin, the proportion of tosses that land heads should be about 0.5.

This idea of the probability of a particular outcome as the long-run proportion of times that we see that outcome is called the frequentist theory of probability, and we will be using this theory in our class. (A different theory of probability uses a subjective notion of probability, but we won’t get into that at this time.) We will think about the probability of heads as the long-run relative frequency: the proportion of times the coin lands heads if we toss it many, many times.

This fits with our intuition that, for a fair coin, each of the two possible outcomes occurs roughly the same number of times when we toss it over and over again. This is the justification for calling the two outcomes equally likely, and it allows us to define the probability of heads to be 1/2.
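A plot like the one described can be produced with a few lines of base R. This is a minimal sketch and not necessarily the code behind the slide's figure; the names n_max, n and prop_heads, and the plotting choices, are assumptions made for illustration.

```r
set.seed(12345)

coin <- c("Heads", "Tails")
n_max <- 1000  # largest number of tosses shown on the plot

# Toss the coin n_max times, then track the running proportion of heads
tosses <- sample(coin, n_max, replace = TRUE)
n <- seq_len(n_max)
prop_heads <- cumsum(tosses == "Heads") / n

# Proportion of heads after n tosses, with a reference line at 0.5
plot(n, prop_heads, type = "l", ylim = c(0, 1),
     xlab = "Number of tosses", ylab = "Proportion of heads")
abline(h = 0.5, lty = 2)
```

Changing the seed gives a different path early on, but the right-hand end of the curve should stay near 0.5.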

Concept Question 2

01:00

Suppose Ali and Bettina are playing a game, in which Ali tosses a fair coin n times, and Bettina wins one dollar from Ali if the proportion of heads is less than 0.4. Ali lets Bettina decide if n is 10 or 100.

Which n should Bettina choose?

This ties into the plot of the proportion of heads as the number of coin tosses increases. Hopefully Bettina realizes that she should choose n = 10: with fewer tosses she has a better chance of the proportion of heads landing far away from 0.5.
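If there is time, the answer can be checked by simulation. The sketch below plays the game many times for each n and records how often the proportion of heads falls below 0.4; the function names bettina_wins and chance_bettina_wins and the choice of 10000 repetitions are assumptions made for illustration.

```r
set.seed(20)

coin <- c("Heads", "Tails")

# Play the game once: TRUE if the proportion of heads in n tosses is below 0.4
bettina_wins <- function(n) {
  heads <- sum(sample(coin, n, replace = TRUE) == "Heads")
  heads / n < 0.4
}

# Estimate Bettina's chance of winning by repeating the game many times
chance_bettina_wins <- function(n, reps = 10000) {
  mean(replicate(reps, bettina_wins(n)))
}

win_10 <- chance_bettina_wins(10)
win_100 <- chance_bettina_wins(100)
c(n_10 = win_10, n_100 = win_100)
```

The estimate for n = 10 should come out clearly larger than the one for n = 100, matching what the plot of proportions suggests.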

Concept Question 3

Part 1: Suppose we roll a die 4 times. The chance that we see six (the face with six spots) at least once is given by 1/6 + 1/6 + 1/6 + 1/6 = 4/6 = 2/3.

True or false?


Part 2: Suppose we roll a pair of dice 24 times. The chance that we see a pair of sixes at least once is given by 24 × 1/36 = 24/36 = 2/3.

True or false?

01:00

Both parts are false since we are using the addition rule on events that are not mutually exclusive. It is important that you do NOT talk about what the actual probability is since that uses the multiplication rule and we have not discussed that in the notes. This is more an exercise in when not to use the addition rule.
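If students want evidence that 2/3 is wrong without bringing in the multiplication rule, a simulation in the style of the coin-toss slides is one option. This is a sketch only; the names die, reps, part1 and part2 are assumptions made for illustration.

```r
set.seed(20)

die <- 1:6
reps <- 10000  # number of times each experiment is repeated

# Part 1: roll one die 4 times; does a six appear at least once?
part1 <- replicate(reps, any(sample(die, 4, replace = TRUE) == 6))

# Part 2: roll a pair of dice 24 times; does a pair of sixes appear at least once?
part2 <- replicate(reps, {
  first <- sample(die, 24, replace = TRUE)
  second <- sample(die, 24, replace = TRUE)
  any(first == 6 & second == 6)
})

# Proportion of repetitions in which the event happened
c(part1 = mean(part1), part2 = mean(part2))
```

Both proportions should land noticeably below 2/3, which is consistent with the addition rule having been misapplied.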

Concept Question 4

01:00

Consider the Venn diagram below, which has 20 possible outcomes in Ω, depicted by the purple dots. Suppose the dots represent equally likely outcomes. What is the probability of A or B or C? That is, what is P(A∪B∪C)?

Ask them what is the quickest way to do this. Maybe even give them only 15 or 30 seconds.
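One quick route is the complement rule: count the dots that lie outside all three circles instead of the dots inside them. As a sketch, with k standing for that count (k is a label introduced here, and its value has to be read off the diagram):

```latex
\begin{align*}
P(A \cup B \cup C) &= 1 - P\big((A \cup B \cup C)^{C}\big) \\
                   &= 1 - \frac{k}{20}
\end{align*}
```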

Activity: Coin tossing

01:00

This activity is taken from the book Teaching Statistics: A Bag of Tricks by Gelman and Nolan (Section 8.3.2). The goal is to point out to students the difficulties in trying to make up a random sequence. It highlights their “common misconceptions about randomness” and shows them that haphazard is not the same as random, while motivating the idea of sampling distributions.

The demonstration to be done in class is as follows:

  • Divide your class into no more than 10 groups. This is quite important, since your chance of making a mistake increases with the number of groups. The size of the groups will depend on your attendance today.

  • Further divide each group into two subgroups.

  • For each group, instruct one of the subgroups to flip a quarter 100 times and record the results as a sequence of 0’s and 1’s, with 1 representing the coin landing heads and 0 representing the coin landing tails. Maybe tell the rest of the group to make sure that the designated flipper does good flips, with many rotations, and catches the coin rather than letting it fall.

  • For the same group, instruct the other subgroup to make up a sequence of 100 0’s and 1’s that is supposed to represent the result of 100 coin flips and write it on a sheet of paper. They mustn’t actually flip a coin, use their phone, use their computer, etc. They also mustn’t talk to the other subgroup.

  • Then have a different member of each of the subgroups go and put their sequence on the board, with their names. Maybe it is a good thing that the groups will be larger, so they can catch errors.

  • Now you go in and identify the real vs. fake flips for each group by looking for the sequences that have long streaks of one number: the real sequence usually has longer streaks than the made-up one.

  • Once you are done, and hopefully you have impressed them with your ability to spot the fake, you can show them, to some extent, what’s going on. Have each group go to their sequence and count two quantities: the length of the longest run in the sequence, and the number of switches from 0 to 1 or the other way (number of runs = number of switches + 1).

  • Then you can hand out copies of the plot I showed you yesterday of the simulated flips, which shows the distribution of these pairs of numbers, or display it. Get each group to come and tell you their values of the pair (number of runs, longest run).

  • It should not take too long to annotate the plot with the at most 20 points.

Good luck! I am worried that we will have too many groups and make mistakes, which will reduce the dramatic impact.
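The two quantities the groups count, and a simulated reference distribution for them, can be computed with base R's rle(). This is a sketch under stated assumptions and not the code behind the plot mentioned in the steps; run_summary, example and sims are names introduced here for illustration.

```r
set.seed(20)

# Summarise a 0/1 sequence by its number of runs and its longest run
run_summary <- function(flips) {
  runs <- rle(flips)
  c(n_runs = length(runs$lengths), longest_run = max(runs$lengths))
}

# A short example: the runs are 1 1 | 0 | 1 1 1 | 0 0
example <- c(1, 1, 0, 1, 1, 1, 0, 0)
run_summary(example)
# n_runs = 4, longest_run = 3

# Simulate 1000 sequences of 100 fair flips to see which pairs are typical
sims <- replicate(1000, run_summary(sample(c(0, 1), 100, replace = TRUE)))
sims <- as.data.frame(t(sims))

# Jitter the points so that repeated pairs remain visible
plot(jitter(sims$n_runs), jitter(sims$longest_run),
     xlab = "Number of runs", ylab = "Longest run")
```

Each group's pair can then be added to the plot with points(), which makes it easy to see whether a sequence sits inside or outside the cloud of simulated flips.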
