Lecture 04 - Probability and Bayes' Theorem
Meta
Key Topics
Bayes' Theorem, Data cleaning, dplyr, Git, GitHub, Probability, Probability operations, R, Reproducible examples
Resources
Lecture Equations, Lab 03, Lab 03 Replication, Lecture Prep 04, Lecture Prep Replication, Lecture Slides
Probability in the News
FiveThirtyEight published a great article in 2017 on the use of probability in news media coverage. The article raises important points about how error in public polling is communicated:
As I’ve documented throughout this series, polls and other data did not support the exceptionally high degree of confidence that news organizations such as The New York Times regularly expressed about Hillary Clinton’s chances.
One of the key takeaways for me was this line:
What can also get lost is that election forecasts — like hurricane forecasts — represent a continuous range of outcomes, none of which is likely to be exactly right.
When we talk about the likelihood of an event, we are really trying to summarize a range of possible outcomes. Keep that in mind as we progress through the semester!
John Edmund Kerrich
One of the important statisticians we discussed this week was John Edmund Kerrich. You can explore Kerrich’s data using testDriveR:
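library(ggplot2)     # provides ggplot(), geom_hline(), geom_line(), ylim()
library(testDriveR)  # provides the kerrich data
coinFlips <- kerrich
# reference line at .5, then the average column plotted against flip id
ggplot(data = coinFlips) +
  geom_hline(mapping = aes(yintercept = .5, color = "p(heads)")) +
  geom_line(mapping = aes(x = id, y = average)) +
  ylim(0, 1)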
The ggplot2 syntax above is only slightly more complex than your previous plots. We’ve added a second geom, geom_hline(), that is layered under the primary geom_line() that we discussed during Lecture 02. It adds a horizontal line at y = .5, which is the expected probability of either outcome of a fair coin flip. We’ve also used ylim() to modify the y-axis so that it runs from 0 to 1, covering the full range of possible probabilities.
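If you want to see where a line like the one in this plot comes from, the following is a minimal sketch that simulates fair coin flips in base R and computes the running proportion of heads. It is not Kerrich's data: the object name simFlips, the seed, the number of flips, and the outcome column are all assumptions made for illustration. The id and average columns are named to match the variables used in the plot above.

```r
# A simulated stand-in for a coin-flip experiment (not Kerrich's actual data).
set.seed(4)   # arbitrary seed so the sketch is reproducible
n <- 2000     # assumed number of flips, chosen only for illustration

simFlips <- data.frame(
  id = seq_len(n),                           # flip number
  outcome = rbinom(n, size = 1, prob = 0.5)  # hypothetical column: 1 = heads, 0 = tails
)

# running average: heads observed so far divided by flips so far
simFlips$average <- cumsum(simFlips$outcome) / simFlips$id

# the last running average is the overall proportion of heads
tail(simFlips$average, 1)
mean(simFlips$outcome)
```

Because simFlips has id and average columns, it should work as a drop-in replacement for coinFlips in the ggplot2 call above. Changing the seed or n lets you see how quickly (or slowly) the running average settles near .5.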
Extra Information
This week, I mentioned a number of important statisticians. If you want more information, you can check out these Wikipedia pages:
We also talked about a number of statistical fallacies. One of them was Meadow’s Law, which you can read about on Wikipedia and in various news articles, including this one from 60 Minutes.