Bayesian statistics is a branch of stats that contrasts with frequentist statistics. Unlike a frequentist, a Bayesian starts with a prior (an initial estimate of the probability of an event occurring) and then iteratively updates that prior as new information comes in.
To fully understand Bayesian stats, it makes sense to start with frequentism. A frequentist seeks to estimate the probability of an event by looking at how often that event has occurred in the past. Suppose we are trying to estimate the probability that the incumbent President will win re-election. A naive frequentist will look at all U.S. presidential elections, see that the incumbent has run in 33 elections and won 22 of them, and arrive at a rate of 66.7%. This is a bit of a simplification: choosing the right sample to evaluate is still a tricky problem (perhaps one only wants to consider presidential elections since 1900, for instance), and a frequentist may also want to include other factors in their analysis (for example, they might want to evaluate the probability conditional on the Presidential approval rating).
In contrast, a Bayesian starts with a prior probability and then updates that prior in response to new information. Suppose they wanted to answer whether the President would win re-election. They might start with a prior of 66.7% (since that is the base rate). They then might add in the approval rating, polling, economic factors and other considerations to nudge that probability to a posterior probability.
These adjustments are not made ad hoc. Bayesians follow a specific formula known as Bayes’ Rule. Bayes’ Rule is as follows:
P(A|B) = P(B|A) * P(A) / P(B)
The P() means “probability of” and the “|” symbol means “conditional on”. So P(A|B), in English, means “the probability that A will occur, given B”. For example, you may wish to know the “probability that my positive COVID-19 test is a false positive”. In that case, A is “I do not have COVID-19” and B is “I have received a positive test for COVID-19”. Suppose the test is 99% accurate, meaning it gives the right answer 99% of the time for both infected and uninfected people. In that case, P(B|A) (the probability that you test positive, conditional on not having COVID-19) is 1%. P(A) is the share of the population that does not have COVID-19 at that moment. Suppose that there’s a massive outbreak and 3.2 million Americans (roughly 1% of all Americans) have COVID-19 at that moment, so P(A) is 99% (the base-rate probability of not having COVID-19). P(B) is the base-rate probability of testing positive, which is roughly 2% (1% of the 99% who do not have COVID-19, plus 99% of the 1% who do, or 1.98% before rounding). Then the equation becomes P(A|B) = 1% * 99% / 2% = 49.5% (exactly 50% if P(B) is left unrounded at 1.98%). Thus even with a 99% accuracy rate, the probability that a positive result is a false positive could still be about 50%!
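The arithmetic in the COVID-19 example can be checked with a few lines of Python. This is a minimal sketch: the function name `posterior` and the variable names are made up for illustration, and it assumes that "99% accurate" means the test is right 99% of the time for both infected and uninfected people. P(B) is computed from its two components rather than rounded to 2%.

```python
def posterior(prior_a, p_b_given_a, p_b_given_not_a):
    """Return P(A|B) using Bayes' Rule.

    P(B) is expanded with the law of total probability:
    P(B) = P(B|A) * P(A) + P(B|not A) * P(not A)
    """
    p_b = p_b_given_a * prior_a + p_b_given_not_a * (1.0 - prior_a)
    return p_b_given_a * prior_a / p_b


# A = "I do not have COVID-19", B = "I tested positive for COVID-19"
p_no_covid = 0.99            # P(A): 99% of the population is not infected
p_pos_given_no_covid = 0.01  # P(B|A): the test is wrong 1% of the time
p_pos_given_covid = 0.99     # P(B|not A): assumed equal to the 99% accuracy

p_false_positive = posterior(p_no_covid, p_pos_given_no_covid, p_pos_given_covid)
print(f"P(no COVID-19 | positive test) = {p_false_positive:.3f}")
```

Without rounding P(B), the result is 50% rather than 49.5%, which does not change the conclusion.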
That’s why a good Bayesian rarely stops with one iteration. You may want to add other conditionals, such as “conditional on having symptoms”, “conditional on being a close contact”, etc. that can dramatically change your ultimate (posterior) probability. In short, unlike a frequentist, a Bayesian starts with a base rate and then updates as more facts come in.
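One way to see how much extra conditions matter is to rerun the false-positive calculation with different starting probabilities of infection. The sketch below assumes the same 99%-accurate test as the example above. The 1% figure comes from that example, but the infection rates for people with symptoms or close contacts are hypothetical numbers chosen only for illustration, not real epidemiological data.

```python
def p_not_infected_given_positive(p_infected, sensitivity=0.99, specificity=0.99):
    """Probability that a positive test is a false positive, via Bayes' Rule."""
    p_not_infected = 1.0 - p_infected
    false_positives = (1.0 - specificity) * p_not_infected  # P(B|A) * P(A)
    true_positives = sensitivity * p_infected               # P(B|not A) * P(not A)
    return false_positives / (false_positives + true_positives)


# Prior probability of being infected under each condition.
scenarios = {
    "general population": 0.01,           # from the example in the text
    "has symptoms": 0.20,                 # hypothetical value
    "symptoms and close contact": 0.50,   # hypothetical value
}

for label, p_infected in scenarios.items():
    p_false = p_not_infected_given_positive(p_infected)
    print(f"{label}: P(false positive | positive test) = {p_false:.1%}")
```

The higher the prior probability of infection, the less likely it is that a positive result is a false positive, even though the test itself has not changed.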
Bayesian thinking is widely considered to be a highly effective tactic for accurate forecasting. Leading forecasting intellectuals like Professor Philip Tetlock recommend extensively researching a good prior (base rate) and then meticulously updating on new information. The prior (base rate) is often called the “outside view”. Two common pitfalls are “insufficiently updating when new information comes in” and “over-reacting to new or specific information and prematurely discarding the base rate”. A good Bayesian systematically develops an informed prior and updates according to proper statistical rules.
Consider the following example: suppose you want to estimate the probability of a recession occurring in a given year. First you want to develop a base rate. In the last 100 years, 21 years have had a recession, so 21% is a reasonable base rate. Of course, the choice of “last 100 years” is a bit arbitrary; maybe you want to choose a longer or shorter time period. Or maybe you want to count only “non-recessionary years in which a recession started” (so the Great Depression and Great Recession aren’t counted several times). All of these would be legitimate starting points. Whichever you choose forms the prior, also known as the base rate or the “outside view”. You then start adding in specific information, such as the unemployment rate, the yield curve, inflation, statements from the Fed, the government response, etc., and update your views using Bayes’ Rule.
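The recession example can be sketched as a sequence of updates. The code below uses the odds form of Bayes' Rule, in which the posterior odds equal the prior odds multiplied by a likelihood ratio, P(evidence | recession) / P(evidence | no recession). The 21% base rate comes from the text; the pieces of evidence and their likelihood ratios are hypothetical placeholders, not estimates from real data. Chaining the updates this way also assumes the pieces of evidence are independent of each other given the outcome, which is a simplification.

```python
def update(prior, likelihood_ratio):
    """One Bayesian update in odds form.

    likelihood_ratio = P(evidence | event) / P(evidence | no event)
    """
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


base_rate = 21 / 100  # the "outside view": 21 recession years out of 100

# (description, likelihood ratio): all values are hypothetical.
evidence = [
    ("inverted yield curve", 3.0),     # assumed 3x more likely before a recession
    ("low unemployment", 0.8),         # assumed slightly less likely before a recession
    ("hawkish Fed statements", 1.5),   # assumed somewhat more likely before a recession
]

probability = base_rate
print(f"prior (base rate): {probability:.1%}")
for description, likelihood_ratio in evidence:
    probability = update(probability, likelihood_ratio)
    print(f"after '{description}': {probability:.1%}")
```

A likelihood ratio above 1 pushes the probability up, a ratio below 1 pushes it down, and a ratio of exactly 1 leaves the prior unchanged.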