Lecture 04 - Probability and Bayes' Theorem

Meta

Key Topics

Bayes' Theorem, Data cleaning, dplyr, Git, GitHub, Probability, Probability operations, R, Reproducible examples

Resources

View Lecture

Equations

Lab 03

Lab 03 Replication

Lecture Prep 04

Lecture Prep Replication

Lecture Slides

Probability in the News

In 2017, FiveThirtyEight published a great article on the use of probability in news media coverage. The article raises important points about how error in public polling is communicated:

As I’ve documented throughout this series, polls and other data did not support the exceptionally high degree of confidence that news organizations such as The New York Times regularly expressed about Hillary Clinton’s chances.

One of the key takeaways for me was this line:

What can also get lost is that election forecasts — like hurricane forecasts — represent a continuous range of outcomes, none of which is likely to be exactly right.

When we talk about the likelihood of an event, we are really trying to summarize a range of possible outcomes. Keep that in mind as we progress through the semester!

John Edmund Kerrich

One of the important statisticians we discussed this week was John Edmund Kerrich. You can explore Kerrich’s data using testDriveR:

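# testDriveR supplies the kerrich data; ggplot2 supplies the plotting functions
library(testDriveR)
library(ggplot2)

# store Kerrich's coin flip data in a new object
coinFlips <- kerrich

# plot the average by flip (id), with a reference line at .5
ggplot(data = coinFlips) +
  geom_hline(mapping = aes(yintercept = .5, color = "p(heads)")) +
  geom_line(mapping = aes(x = id, y = average)) +
  ylim(0, 1)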

[Figure: kerrichPlot, the plot produced by the code above]

The ggplot2 syntax above is only slightly more complex than your previous plots. We’ve added a second geom, geom_hline(), which is layered under the primary geom_line() that we discussed during Lecture 02. It draws a horizontal line at y = .5, which is the expected probability of either outcome of a fair coin flip. Because its color is mapped inside aes(), the line also gets a legend entry labeled "p(heads)". Finally, we’ve used ylim() to set the y-axis so that it runs from 0 to 1, covering the full range of possible probabilities.
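If you want to see the same pattern without Kerrich’s data, you can simulate coin flips yourself. The sketch below is only an illustration, not part of the lecture materials: the object names simFlips and simPlot, the seed, the number of flips, and the outcome column are all assumptions made for this example. The id and average columns are named to match the variables used in the plot above, and average is computed here as the running proportion of heads.

```r
library(ggplot2)

# assumptions for illustration only: the seed and number of flips are arbitrary
set.seed(4)
nFlips <- 10000

# simulate fair coin flips, where 1 = heads and 0 = tails
simFlips <- data.frame(
  id = 1:nFlips,
  outcome = rbinom(n = nFlips, size = 1, prob = 0.5)
)

# running proportion of heads after each flip
simFlips$average <- cumsum(simFlips$outcome) / simFlips$id

# same layers as the Kerrich plot: reference line at .5 under the running average
simPlot <- ggplot(data = simFlips) +
  geom_hline(mapping = aes(yintercept = .5, color = "p(heads)")) +
  geom_line(mapping = aes(x = id, y = average)) +
  ylim(0, 1)

simPlot
```

With a fair coin, the running average should bounce around quite a bit over the first few flips and then settle close to the .5 line as the number of flips grows. Try changing the seed or the prob argument to see how the line behaves.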

Extra Information

This week, I mentioned a number of important statisticians. If you want more information, you can check out these Wikipedia pages:

We also talked about a number of statistical fallacies. One of them was Meadow’s Law, which you can read about on Wikipedia and in various news articles, including this one from 60 Minutes.