Lab: Elections

STAT 20: Introduction to Probability and Statistics

2009 Iran Election

Background

  • Ongoing public sentiment that the previous election was fraudulent
  • The highest voter turnout in Iran’s history

Leading candidates

  • Mahmoud Ahmadinejad: Leader of conservatives and incumbent president.
  • Mir-Hossein Mousavi: Reformist and former prime minister. Seeking rapid political evolution.

Outcome

Ahmadinejad won the election with 62.6% of the votes cast, while Mousavi received 33.75% of the votes cast.

Post-election controversies and unrest

  • Allegations of fraud
  • Public protests and unrest
  • The green wave movement, led by Mousavi, against the allegedly fraudulent election and Ahmadinejad’s regime

Was the election fraudulent?

Benford’s Law

What is the distribution of city/town populations in all cities and towns in California?

What is the distribution of the first digit of city/town populations in all cities and towns in California?

(These extensive notes were the instructions to TAs when facilitating this in spring 2022.)

This first component can be done either during Berkeley time or at the start of class along with students. It depends on how long you expect the rest of the lesson to take and what you’d like to emphasize: [Write on the board the two “fundamental” distributions that we’ve covered so far: the Binomial and the Bernoulli. The slide at ?var:site-urllectures/14/binomial.html#/bernoulli-distributionl will give you a sense of the notation and layout that’s used. Best to replicate each element of these slides, including the plots.]

It’s reasonable to ask students to close their laptops for the first part of class.

Ask students for a guess or two at the population of Berkeley. Write the guesses on the board, then go to Wikipedia, get the true answer, and correct the one(s) on the board. From there, follow links through Wikipedia to other towns and cities in California, building up a list of 10 city names and their populations (randomly picking a few of these links is a good method: https://en.wikipedia.org/wiki/Category:Incorporated_cities_and_towns_in_California). Structure this on the board as a data frame with two columns, city and population, and 10 rows.

Ask students to sketch the distribution of two variables: 1. What they expect the population variable to look like were they to collect the population of all the cities and towns in California. 2. The distribution of the value of the first digit in the population counts of all of these towns. This is probably best done as think-pair-share: give them a couple of minutes to sketch silently, then ask them to share with a neighbor, then ask a pair to describe their distributions as you draw them on the board.

Probing questions: 1. Where does this variable sit in the Data Taxonomy? 2. What is the range of possible values this variable can take? 3. What geometry will you use? 4. What shape/modality/center/spread would you expect to see? 5. What labels belong on the axes? 6. What is a good title for this plot?

The first should be a histogram or density plot (boxplot is meh) that is heavily right skewed. The second should be a bar chart on the integers 1-9, with the probabilities decreasing as the digit increases.
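If it helps to check these expected shapes before class, a small simulation can stand in for the board sketch. The block below is only an illustration under stated assumptions: the "populations" are synthetic draws from a log-uniform distribution between 1,000 and 10,000,000 (an assumed stand-in for a heavily right-skewed variable, not real California data), and the helper name `first_digit` is introduced here for illustration.

```python
import random
from collections import Counter


def first_digit(n):
    """Return the leading (most significant) digit of a positive integer."""
    return int(str(abs(int(n)))[0])


# Synthetic, made-up "populations": log-uniform between 10^3 and 10^7.
# This is an assumed stand-in for right-skewed data, not California figures.
rng = random.Random(20)
populations = [int(10 ** rng.uniform(3, 7)) for _ in range(10_000)]

# Proportion of each first digit 1-9 in the synthetic sample.
counts = Counter(first_digit(p) for p in populations)
proportions = {d: counts[d] / len(populations) for d in range(1, 10)}

# Crude text bar chart: one '#' per percentage point.
for d in range(1, 10):
    print(d, round(proportions[d], 3), "#" * round(proportions[d] * 100))
```

Under this assumed model, the bars should shrink from digit 1 to digit 9, which is the pattern students are asked to sketch.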

Write out Benford’s Law in a similar way to the other named distributions and describe it as a distribution that we’re going to try to use to describe vote counts. Start the calculation of the expected value E and variance V of Benford’s Law just so that students can see what goes into each term of those sums (I’m actually not sure what the base-10 Benford’s E and V are).

Benford’s Law

Let X be the first digit of a randomly selected number. X∼Benfords() if

$$P(X = x) = \log_{10}\left(1 + \frac{1}{x}\right), \quad x \in \{1, 2, \ldots, 9\}$$
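The probabilities, expected value, and variance of this distribution can be worked out numerically. The following is a minimal sketch using only the formula above and the usual definitions of E(X) and Var(X) for a discrete random variable. The variable names are chosen here for illustration. Under these definitions the results come out to roughly E(X) ≈ 3.44 and Var(X) ≈ 6.06.

```python
import math

# Benford's Law pmf for first digits 1..9: P(X = x) = log10(1 + 1/x)
pmf = {x: math.log10(1 + 1 / x) for x in range(1, 10)}

# The probabilities telescope to log10(10) = 1.
total = sum(pmf.values())

# E(X) = sum of x * P(X = x); Var(X) = sum of (x - E(X))^2 * P(X = x)
expected_value = sum(x * p for x, p in pmf.items())
variance = sum((x - expected_value) ** 2 * p for x, p in pmf.items())

for x, p in pmf.items():
    print(f"P(X = {x}) = {p:.3f}")
print(f"E(X) = {expected_value:.3f}, Var(X) = {variance:.3f}")
```

Each term of the two sums is one digit times (or squared deviation times) its probability, which is what students would write out by hand on the board.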

Benford’s Law and Elections

Fraud detection using Benford’s Law

  • A common theory is that in a naturally occurring, fair election, the first digits of the county-by-county vote counts should follow Benford’s Law. If they do not, that might suggest that vote counts have been manually altered.
  • This theory was brought to bear to determine whether the 2009 presidential election in Iran showed irregularities [1].

  [1] https://physicsworld.com/a/benfords-law-and-the-iranian-e/

Lab: Elections

In this lab we will:

  • Examine the Benford’s Law probability distribution
  • Compare the first digits of vote counts in the 2009 Iranian election to this distribution
  • Reach a conclusion on whether the election was fraudulent (or whether Benford’s Law is a good tool for detecting fraud in the first place).
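One possible way to carry out the comparison step is a Pearson chi-square statistic between observed first-digit counts and the counts Benford’s Law would predict. This is a sketch of one common approach, not necessarily the method the lab uses. The function names `benford_expected` and `chi_square_statistic` are hypothetical, and the `observed` counts are invented purely for illustration. They are not the Iranian election data.

```python
import math


def benford_expected(n):
    """Expected first-digit counts under Benford's Law for n observations."""
    return {d: n * math.log10(1 + 1 / d) for d in range(1, 10)}


def chi_square_statistic(observed):
    """Pearson chi-square statistic comparing first-digit counts to Benford's Law."""
    n = sum(observed.values())
    expected = benford_expected(n)
    return sum(
        (observed.get(d, 0) - expected[d]) ** 2 / expected[d] for d in range(1, 10)
    )


# Hypothetical first-digit counts, invented for illustration only.
# These are NOT the Iranian election data.
observed = {1: 110, 2: 70, 3: 45, 4: 38, 5: 30, 6: 25, 7: 21, 8: 19, 9: 8}

stat = chi_square_statistic(observed)
# 15.507 is the 0.95 quantile of the chi-square distribution
# with 9 - 1 = 8 degrees of freedom.
CRITICAL_VALUE = 15.507
print(f"chi-square = {stat:.2f}; exceeds 5% critical value: {stat > CRITICAL_VALUE}")
```

A large statistic says only that the first digits depart from Benford’s Law. Whether that departure indicates fraud is the separate question the lab asks you to weigh.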
