Probability
Conditional probability & Bayes
15 min
Learning goals
- You can correctly distinguish conditional probabilities from their reversals.
- You can intuitively derive Bayes' formula via a hypothetical cohort.
- You can explain why a second confirmation test is essential for rare conditions.
P(disease | test +) – PPV
7.5 %
7.20 of 96.5 positive findings are true positives.
P(healthy | test −) – NPV
99.9 %
903 of 904 negative findings are true negatives.
Frequency tree
A cohort of 1,000 people, broken down step by step.
2 × 2 contingency table
True state (rows) × test result (columns), at N = 1,000.
| | Test + | Test − | Σ |
|---|---|---|---|
| Diseased | 7.20 (TP) | 0.80 (FN) | 8.00 |
| Healthy | 89.3 (FP) | 903 (TN) | 992 |
| Σ | 96.5 | 904 | 1,000 |
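The table entries can be reproduced from three inputs. The following is a minimal sketch, assuming a prevalence of 0.80 %, a sensitivity of 90.0 % and a false-positive rate of 9.0 % (the values that appear in the substituted formula on this page). The function name `contingency_table` and its parameter names are illustrative and not part of the lesson.

```python
def contingency_table(n, prevalence, sensitivity, false_positive_rate):
    """Return expected counts (TP, FN, FP, TN) for a hypothetical cohort of n people."""
    diseased = n * prevalence          # row total "Diseased"
    healthy = n - diseased             # row total "Healthy"
    tp = diseased * sensitivity        # diseased and test positive
    fn = diseased - tp                 # diseased but test negative
    fp = healthy * false_positive_rate # healthy but test positive
    tn = healthy - fp                  # healthy and test negative
    return tp, fn, fp, tn


# Assumed inputs: N = 1,000, prevalence 0.80 %, sensitivity 90.0 %, false-positive rate 9.0 %
tp, fn, fp, tn = contingency_table(1000, 0.008, 0.90, 0.09)

ppv = tp / (tp + fp)  # P(disease | test +)
npv = tn / (tn + fn)  # P(healthy | test -)

print(f"TP={tp:.2f} FN={fn:.2f} FP={fp:.1f} TN={tn:.0f}")
print(f"PPV={ppv:.1%} NPV={npv:.1%}")
```

The counts are expected values, so fractions of a person such as 7.20 are normal here; the PPV and NPV are simply the column-wise shares of correct results.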
Bayes' formula with values substituted
We are looking for P(disease | +) = P(+ | disease) · P(disease) / P(+), where P(+) = P(+ | disease) · P(disease) + P(+ | healthy) · P(healthy).
P(D | +) = 90.0 % · 0.80 % / [90.0 % · 0.80 % + 9.0 % · 99.2 %]
P(D | +) = 0.0072 / [0.0072 + 0.0893]
P(D | +) = 7.46 %
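The same substitution can be checked in a few lines of code. This is a sketch only: the function name `posterior` and its parameter names are hypothetical, and the inputs are the values substituted above (prior 0.80 %, P(+ | disease) = 90.0 %, P(+ | healthy) = 9.0 %).

```python
def posterior(prior, p_pos_given_d, p_pos_given_not_d):
    """P(D | +) by Bayes' formula, with P(+) from the law of total probability."""
    numerator = p_pos_given_d * prior                            # P(+ | D) * P(D)
    evidence = numerator + p_pos_given_not_d * (1.0 - prior)     # P(+)
    return numerator / evidence


p = posterior(prior=0.008, p_pos_given_d=0.90, p_pos_given_not_d=0.09)
print(f"P(D | +) = {p:.2%}")  # P(D | +) = 7.46%
```

Writing the denominator as the sum of true-positive and false-positive shares makes the base-rate effect visible: the second term dominates whenever the prior is small.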
Dot grid: 1,000 hypothetical people
Each dot stands for one person. True positives are what we are looking for — false positives are the problem.
- TP – true positive (7)
- FN – false negative (1)
- FP – false positive (89)
- TN – true negative (903)
What does it tell us?
- Base-rate neglect: Even though sensitivity is 90.0 %, only about 7.5 % of the positive findings are real. That is the effect of a rare disease: most positive results are false positives (89.3 of 96.5).
- Raising the prevalence, say by targeted screening in a risk group, usually boosts the PPV dramatically, even though the test's quality is unchanged. This is exactly why mass screening for rare conditions is tricky (Gigerenzer & Hoffrage 1995).
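The dependence of the PPV on prevalence can be explored with a short sketch. The function name `ppv` is illustrative, the test characteristics are the ones used on this page (sensitivity 90.0 %, false-positive rate 9.0 %), and the higher prevalences of 5 % and 20 % are made-up values chosen only for comparison. The last lines treat the PPV after one positive result as the prior for a second test, which assumes that the two tests are conditionally independent and have the same error rates; that assumption is not stated in the lesson.

```python
def ppv(prevalence, sensitivity=0.90, false_positive_rate=0.09):
    """Positive predictive value P(disease | test +) for a given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = false_positive_rate * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Same test, different prevalences (5 % and 20 % are hypothetical risk groups)
for prevalence in (0.008, 0.05, 0.20):
    print(f"prevalence {prevalence:6.1%} -> PPV {ppv(prevalence):.1%}")

# Confirmation test: the first positive result raises the prior to ppv(0.008).
# Assumes the second test is independent of the first given the true state.
after_second_positive = ppv(ppv(0.008))
print(f"PPV after a second positive test: {after_second_positive:.1%}")
```

With these inputs the PPV rises from about 7.5 % at a prevalence of 0.80 % to roughly 34 % and 71 % at the two higher prevalences, and to about 45 % after a second positive result under the independence assumption.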