Probability

Conditional probability & Bayes

15 min

Learning goals

•You can correctly distinguish conditional probabilities from their reversals.
•You can intuitively derive Bayes' formula via a hypothetical cohort.
•You can explain why a second confirmation test is essential for rare conditions.

P(disease | test +) – PPV

7.5 %

7.2 of 96.5 positive findings are true positives.

P(healthy | test −) – NPV

99.9 %

902.7 of 903.5 negative findings are true negatives.

Frequency tree

A cohort of 1,000 people, broken down step by step.
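
The step-by-step breakdown can be sketched in a few lines of Python. This is a minimal illustration, assuming the figures used elsewhere in this lesson (prevalence 0.8 %, sensitivity 90 %, false-positive rate 9 %); the function name `frequency_tree` and the dictionary keys are hypothetical, chosen only for this example.

```python
# Assumed inputs, taken from the values used in this lesson.
N = 1000                 # cohort size
PREVALENCE = 0.008       # P(disease)
SENSITIVITY = 0.90       # P(+ | disease)
FALSE_POS_RATE = 0.09    # P(+ | healthy)


def frequency_tree(n, prevalence, sensitivity, false_pos_rate):
    """Split a cohort first by true state, then by test result."""
    diseased = n * prevalence
    healthy = n - diseased
    tp = diseased * sensitivity      # diseased and test positive
    fn = diseased - tp               # diseased and test negative
    fp = healthy * false_pos_rate    # healthy and test positive
    tn = healthy - fp                # healthy and test negative
    return {
        "diseased": diseased, "healthy": healthy,
        "TP": tp, "FN": fn, "FP": fp, "TN": tn,
    }


tree = frequency_tree(N, PREVALENCE, SENSITIVITY, FALSE_POS_RATE)
print(f"{N} people")
print(f"  diseased: {tree['diseased']:.2f}")
print(f"    test +: {tree['TP']:.2f}   test -: {tree['FN']:.2f}")
print(f"  healthy:  {tree['healthy']:.2f}")
print(f"    test +: {tree['FP']:.2f}   test -: {tree['TN']:.2f}")
```

The leaves are expected counts rather than whole people, which is why they can be fractional.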

2 × 2 contingency table

True state (rows) × test result (columns), at N = 1,000.
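
Reading the table row-wise or column-wise gives a conditional probability or its reversal. The sketch below assumes the cell counts that follow from the lesson's figures (prevalence 0.8 %, sensitivity 90 %, false-positive rate 9 %); the helper names `row_conditional` and `column_conditional` are hypothetical.

```python
# Expected cell counts at N = 1,000 (assumed from the lesson's figures).
table = {
    "diseased": {"+": 7.20, "-": 0.80},
    "healthy": {"+": 89.28, "-": 902.72},
}


def row_conditional(table, row, col):
    """P(col | row): share of one true-state row with a given test result."""
    return table[row][col] / sum(table[row].values())


def column_conditional(table, row, col):
    """P(row | col): share of one test-result column with a given true state."""
    return table[row][col] / sum(cells[col] for cells in table.values())


sensitivity = row_conditional(table, "diseased", "+")   # P(+ | disease)
ppv = column_conditional(table, "diseased", "+")        # P(disease | +)
npv = column_conditional(table, "healthy", "-")         # P(healthy | -)

print(f"P(+ | disease) = {sensitivity:.1%}")
print(f"P(disease | +) = {ppv:.1%}")
print(f"P(healthy | -) = {npv:.1%}")
```

The same cell (7.2 true positives) is divided by the row total for sensitivity and by the column total for the PPV, which is the whole difference between the two quantities.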

Bayes' formula with values substituted

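We are looking for P(disease | +) = P(+ | disease) · P(disease) / P(+), where the denominator expands by the law of total probability: P(+) = P(+ | disease) · P(disease) + P(+ | healthy) · P(healthy).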

P(D | +) = 90.0 % · 0.80 % / [90.0 % · 0.80 % + 9.0 % · 99.2 %]

P(D | +) = 0.0072 / [0.0072 + 0.0893]

P(D | +) = 7.46 %
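
The substitution above can be checked with a small function. This is a sketch, not a reference implementation; the function name `posterior` and its parameter names are hypothetical, and the inputs are the values used in the formula above.

```python
def posterior(prior, sensitivity, false_pos_rate):
    """Bayes' formula for P(disease | +).

    prior          -- P(disease)
    sensitivity    -- P(+ | disease)
    false_pos_rate -- P(+ | healthy)
    """
    numerator = sensitivity * prior
    # Law of total probability: P(+) over diseased and healthy people.
    evidence = numerator + false_pos_rate * (1.0 - prior)
    return numerator / evidence


ppv = posterior(prior=0.008, sensitivity=0.90, false_pos_rate=0.09)
print(f"P(D | +) = {ppv:.2%}")  # prints: P(D | +) = 7.46%
```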

Dot grid: 1,000 hypothetical people

Each dot stands for one person. True positives are the cases we want to find; false positives are the problem.

•TP – true positive (7)
•FN – false negative (1)
•FP – false positive (89)
•TN – true negative (903)

What does it tell us?

→Base-rate neglect: Even though sensitivity is 90.0 %, only about 7.5 % of the positive findings are real (7.2 of 96.5). That is the effect of a rare disease: most positive results are false positives (89.3 of 96.5).
→Raising the prevalence, say by targeted screening in a risk group, boosts the PPV substantially, even though the test's quality (sensitivity and specificity) is unchanged. This is why mass screening for rare conditions is tricky (Gigerenzer & Hoffrage 1995).
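
The effect of prevalence on the PPV can be explored with a short sketch. The prevalence values 5 % and 20 % are hypothetical illustrations, not figures from this lesson. The last step treats a first positive result as the new prior for a confirmation test, under the assumption (made for this example only) that the second test has the same characteristics and errs independently of the first.

```python
def ppv(prevalence, sensitivity=0.90, false_pos_rate=0.09):
    """P(disease | +) for a test with fixed sensitivity and false-positive rate."""
    true_pos = sensitivity * prevalence
    false_pos = false_pos_rate * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Same test, different populations (5 % and 20 % are illustrative values).
for prevalence in (0.008, 0.05, 0.20):
    print(f"prevalence {prevalence:.1%} -> PPV {ppv(prevalence):.1%}")

# Confirmation test: the posterior after one positive becomes the prior.
# Assumes the two test results are conditionally independent.
after_first = ppv(0.008)
after_second = ppv(after_first)
print(f"after one positive:  {after_first:.1%}")
print(f"after two positives: {after_second:.1%}")
```

Under these assumptions a single positive result leaves the probability of disease below 10 %, while a second independent positive lifts it to roughly 45 %, which is the motivation for confirmation testing in rare conditions.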