Session 7
Bayes’ Brain
Bayesian Thinking and Updating Beliefs
Concept Lesson
Imagine you are a doctor at a clinic in Abuja. A patient walks in, takes a screening test for a rare disease, and the result comes back positive. How worried should you actually be? Your instinct might say 95% worried if the test is 95% accurate, but as you learned in Session 6, the real answer can be shockingly lower when the disease is rare. Bayes’ theorem is the formal tool that gives you the correct answer every time, and it works by forcing you to separate three distinct pieces of information: your prior belief (how common is the disease before the test?), the evidence (how likely is a positive result if the patient is actually sick?), and the total probability of seeing the evidence at all (how many positive results come from sick people versus healthy people who got false alarms?). The formula ties these together: P(disease | positive) = P(positive | disease) × P(disease) ⁄ P(positive). Do not memorize this mechanically — understand the logic. Your updated belief equals how well the evidence supports the hypothesis, scaled by how plausible the hypothesis was to begin with, divided by how surprising the evidence is overall.
Let us trace through the formula step by step with concrete numbers. You know P(disease) = 0.01 — the disease affects 1 in 100 people in Abuja. This is your prior: before any test, there is a 1% chance any given patient has it. The test sensitivity — also called the true positive rate — is P(positive | disease) = 0.95: if the patient is sick, the test correctly says positive 95% of the time, catching 95 out of 100 sick people. The false positive rate is P(positive | no disease) = 0.05: if the patient is healthy, the test still says positive 5% of the time, so 5 out of every 100 healthy people will be incorrectly told they are sick. Now you need P(positive), the total probability of a positive result. This comes from the law of total probability: P(positive) = P(positive | disease) × P(disease) + P(positive | no disease) × P(no disease) = (0.95 × 0.01) + (0.05 × 0.99) = 0.0095 + 0.0495 = 0.059. Plugging into Bayes: P(disease | positive) = (0.95 × 0.01) ⁄ 0.059 = 0.0095 ⁄ 0.059 ≈ 0.161, or about 16.1%. The 95% accurate test gives you only a 16% chance of actually being sick. The reason is that the flood of false positives from the 99% healthy population drowns out the true positives from the 1% who are actually sick.
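The arithmetic above can be checked with a few lines of code. This sketch uses exactly the lesson's numbers (1% prior, 95% sensitivity, 5% false positive rate); the variable names are just illustrative labels.

```python
# Bayes' theorem for the Abuja screening example (numbers from the lesson)
p_disease = 0.01              # prior: 1 in 100 people have the disease
p_pos_given_disease = 0.95    # sensitivity (true positive rate)
p_pos_given_healthy = 0.05    # false positive rate

# Law of total probability: overall chance of any positive result
p_positive = (p_pos_given_disease * p_disease
              + p_pos_given_healthy * (1 - p_disease))

# Bayes' theorem: updated belief after seeing a positive test
p_disease_given_pos = p_pos_given_disease * p_disease / p_positive

print(f"P(positive) = {p_positive:.3f}")                      # 0.059
print(f"P(disease | positive) = {p_disease_given_pos:.3f}")   # 0.161
```

Try changing `p_disease` to 0.10 and rerunning: the same 95% accurate test now yields a posterior near 68%, which shows how strongly the prior controls the answer.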
This is not just a medical curiosity — it is the exact challenge every ML system faces when dealing with rare events. A fraud detector monitoring ₦4 billion in daily mobile money transactions across Nigeria might face a fraud rate of 0.5%. Even with a model that correctly catches 90% of fraud (sensitivity) and has only a 2% false positive rate, Bayes’ theorem tells you that a flagged transaction actually has only about an 18.4% chance of being real fraud: P(fraud | flagged) = (0.90 × 0.005) ⁄ [(0.90 × 0.005) + (0.02 × 0.995)] = 0.0045 ⁄ (0.0045 + 0.0199) = 0.0045 ⁄ 0.0244 ≈ 18.4%. That means over 80% of flagged transactions are legitimate customers being wrongly accused. This is why banks use multi-stage screening: the first filter catches broad patterns, then a second system reviews only the flagged cases, progressively updating the belief at each stage. Each stage is a Bayesian update, and the posterior (your updated belief) from one stage becomes the prior (your starting belief) for the next. This is similar to how a doctor updates their diagnosis: they start with a prior belief about what disease the patient might have (based on symptoms and prevalence), then update that belief when lab results come in (the evidence). New test results can shift the diagnosis dramatically — or barely at all — depending on how consistent the evidence is with each possible condition. Bayesian thinking is not a niche academic topic; it is the logical backbone of how intelligent systems refine their understanding of the world.
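The multi-stage idea — posterior from one stage becomes the prior for the next — can be sketched as a single reusable function. Stage 1 uses the lesson's fraud numbers; the stage-2 sensitivity (85%) and false positive rate (5%) are purely hypothetical values chosen to illustrate the chaining.

```python
def bayes_update(prior, sensitivity, false_positive_rate):
    """Return P(hypothesis | positive evidence) via Bayes' theorem."""
    p_evidence = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_evidence

# Stage 1: fraud filter from the lesson (0.5% base rate, 90% sensitivity, 2% FPR)
posterior_1 = bayes_update(prior=0.005, sensitivity=0.90, false_positive_rate=0.02)
print(f"After stage 1: {posterior_1:.3f}")   # 0.184

# Stage 2 (hypothetical numbers): a stricter review applied only to flagged cases.
# The stage-1 posterior becomes the stage-2 prior.
posterior_2 = bayes_update(prior=posterior_1, sensitivity=0.85, false_positive_rate=0.05)
print(f"After stage 2: {posterior_2:.3f}")   # 0.794
```

Two positive signals in sequence lift the belief from 0.5% to roughly 79% — each stage is weak alone, but the chained updates compound.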
The practical takeaway is this: always think about the base rate. The base rate is simply how common something is before you test for it. If 1% of the population has the disease, the base rate is 1%. This number matters enormously — a test for a rare condition will produce far more false alarms than a test for a common one, even if both tests are equally accurate. How common or rare is the thing you are trying to detect? When the base rate is very low — rare diseases, rare fraud, rare security threats — even a highly accurate model will produce mostly false alarms. This is why accuracy alone is dangerously misleading for imbalanced problems. Metrics like precision (of all flagged items, what fraction are real?) and recall (of all real cases, what fraction did we catch?) exist precisely because they separate the two types of errors that Bayes’ theorem makes visible. A model with 99% accuracy that never catches fraud has 0% recall. A model with 80% accuracy that catches 90% of fraud is far more valuable to the bank. Bayesian thinking forces you to see past surface-level accuracy and understand what a model is actually doing in context.
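The accuracy-versus-recall trap described above can be made concrete with a small confusion-matrix sketch. The 0.5% base rate, 90% catch rate, and 2% false positive rate come from the fraud example earlier in the lesson; "Model A" (which flags nothing) is a hypothetical strawman for comparison.

```python
# Compare two models on 10,000 transactions with a 0.5% fraud rate
# (50 fraudulent, 9,950 legitimate).

def metrics(tp, fp, fn, tn):
    """Accuracy, precision, and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall

# Model A: predicts "legitimate" for everything — never catches fraud.
acc_a, prec_a, rec_a = metrics(tp=0, fp=0, fn=50, tn=9950)

# Model B: catches 45 of 50 fraud cases (90% recall) with a 2% FPR
# (199 false alarms among 9,950 legitimate transactions).
acc_b, prec_b, rec_b = metrics(tp=45, fp=199, fn=5, tn=9751)

print(f"Model A: accuracy={acc_a:.3f}, precision={prec_a:.3f}, recall={rec_a:.3f}")
print(f"Model B: accuracy={acc_b:.3f}, precision={prec_b:.3f}, recall={rec_b:.3f}")
```

Model A scores 99.5% accuracy with 0% recall; Model B scores lower accuracy but catches 90% of fraud. Note that Model B's precision, 45 ⁄ 244 ≈ 18.4%, is exactly the posterior P(fraud | flagged) that Bayes' theorem gave earlier — precision is the Bayesian update in disguise.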
Guided Exercises
Discussion Prompt
How does Bayesian thinking apply to debugging an ML pipeline? Suppose your model’s accuracy drops from 94% to 71% after deployment. You suspect data drift. How would you use Bayesian reasoning to evaluate this hypothesis — what is your prior for data drift being the cause, what evidence would change your belief, and what alternative explanations (software bugs, label errors, infrastructure issues) compete as hypotheses?
Key Takeaway
Always consider the base rate. Flashy evidence is less impressive when the thing you are looking for is rare. A 95% accurate test for a 1-in-100 disease gives you only a 16% chance of being sick. This principle applies to medical tests, fraud detection, anomaly detection, and every ML system that deals with imbalanced data.