Lecture 2 – Association and Causality

Association and causation

The following headline, in Everyday Health, is about a review published in July 2020 in the European Journal of Preventive Cardiology.

Some terminology:

The first question

Is there any relation between chocolate consumption 🍫 and heart disease ❤️?

Association is another term for "any relation" or "link" 🔗.

Some data

Researchers examined [...] a total of 336,289 participants [...] which found that eating any kind of chocolate more than once per week was linked with an 8 percent reduced risk of coronary artery disease.

The second question

Does chocolate consumption 🍫 lead to a reduction in heart disease ❤️?

This is called causation or a "causal" relation.

More headlines

Other headlines about the same research article:

What can you say about the relationship between chocolate consumption 🍫 and a reduction in heart disease ❤️?

A. The data shows that there is an association and this is a causal link. Eating chocolate reduces the risk of heart disease.

B. The data shows evidence of an association but not causation.

C. The data doesn't necessarily show an association, as there could be another explanation for these results not considered here.

Case study: London in 1854

Miasmas, miasmatism, miasmatists

John Snow, 1813-1858 ❄️

Map of Soho, London

Each bar represents a death from cholera. What do you notice?

Broad Street Pump

Now the site of a pub 🍻.

Establishing causation

Comparison ⚖️

Which houses 🏠 were part of the treatment group?

A. All houses in the region of overlap.

B. Houses served by S&V (dirty water) in the region of overlap.

C. Houses served by Lambeth (clean water) in the region of overlap.

Snow's "Grand Experiment"

“… there is no difference whatever in the houses or the people receiving the supply of the two Water Companies, or in any of the physical conditions with which they are surrounded …”

In other words, the two groups were similar except for the treatment.

Snow collected this data:

Does dirty water cause cholera?

A. Yes ✔️, I think so.

B. No ❌, I don't think so.

C. Maybe ❔, I can't tell.

Key to establishing causality 🗝️

If the treatment and control groups are similar apart from the treatment, then the differences between the outcomes in the two groups can be ascribed to the treatment.
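
To make this comparison concrete, here is a minimal Python sketch that computes cholera deaths per 10,000 houses for the two water suppliers. The house and death counts are the figures commonly reproduced from Snow's 1855 report; they are an assumption brought in for illustration and are not taken from this slide. The names `supply` and `deaths_per_10000` are hypothetical.

```python
# Counts commonly reproduced from Snow's 1855 table (assumed for illustration).
supply = {
    "S&V": {"houses": 40_046, "deaths": 1_263},
    "Lambeth": {"houses": 26_107, "deaths": 98},
}


def deaths_per_10000(houses, deaths):
    """Scale the death count to a rate per 10,000 houses so groups of different size are comparable."""
    return 10_000 * deaths / houses


# Outcome in each group, expressed on the same scale.
rates = {name: deaths_per_10000(**row) for name, row in supply.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:.0f} deaths per 10,000 houses")

# How many times higher the death rate was in the S&V (treatment) group.
ratio = rates["S&V"] / rates["Lambeth"]
print(f"S&V rate is about {ratio:.1f} times the Lambeth rate")
```

Because the two groups were similar apart from their water supply, a gap of this size is hard to attribute to anything other than the water.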

Confounding factors

Trouble ⚠️

If the treatment and control groups differ systematically in ways other than the treatment, then those differences, rather than the treatment, might explain the outcomes, which makes it difficult to establish causality.
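
A small simulation can show how this trouble arises. The sketch below is entirely hypothetical: it invents a confounding factor (whether a person is "health-conscious") that makes people both more likely to eat chocolate and less likely to develop heart disease. Chocolate has no effect on heart disease in this simulation, yet an association still appears. All probabilities and names are made up for illustration.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable


def simulate_person():
    # Hypothetical confounder: is this person health-conscious?
    healthy = rng.random() < 0.5
    # The confounder influences who ends up eating chocolate...
    eats_chocolate = rng.random() < (0.7 if healthy else 0.3)
    # ...and it influences the outcome. Chocolate itself plays no role here.
    heart_disease = rng.random() < (0.05 if healthy else 0.15)
    return healthy, eats_chocolate, heart_disease


people = [simulate_person() for _ in range(100_000)]


def disease_rate(rows):
    """Fraction of the given people who have heart disease."""
    rows = list(rows)
    return sum(disease for _, _, disease in rows) / len(rows)


# Comparing chocolate eaters with non-eaters shows an association...
rate_chocolate = disease_rate(p for p in people if p[1])
rate_no_chocolate = disease_rate(p for p in people if not p[1])

# ...but within the health-conscious group alone, the gap should largely vanish.
rate_healthy_chocolate = disease_rate(p for p in people if p[0] and p[1])
rate_healthy_no_chocolate = disease_rate(p for p in people if p[0] and not p[1])

print(rate_chocolate, rate_no_chocolate)
print(rate_healthy_chocolate, rate_healthy_no_chocolate)
```

The overall comparison mixes two kinds of people in different proportions, so the difference in outcomes reflects the confounder and not the chocolate.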

Randomize! 🎲


Regardless of what the dictionary says...

In probability theory, random ≠ haphazard!
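
One way to randomize is to let a chance mechanism, not a person, decide who goes in which group. The sketch below shuffles a list of participants and splits it in half, using Python's standard `random` module. The function name `randomize` and the participant labels are hypothetical, and an even split is only one of several reasonable designs.

```python
import random


def randomize(participants, seed=None):
    """Randomly split participants into a treatment group and a control group."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the caller's list is left unchanged
    rng.shuffle(shuffled)          # every ordering is equally likely
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


participants = [f"person_{i}" for i in range(10)]
treatment, control = randomize(participants, seed=42)

print("Treatment:", treatment)
print("Control:  ", control)
```

Because chance alone decides the assignment, any confounding factor is likely to be spread similarly across both groups, especially when the groups are large.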

Which of these questions would we not be able to answer by setting up a randomized controlled trial?

A. Does daily meditation 😌 reduce anxiety?

B. Does playing video games 🎮 increase aggressive behavior?

C. Does smoking cigarettes 🚬 cause weight loss?

D. Does early exposure to classical music 🎻 increase a person’s IQ?

Ethical and practical limitations of establishing causality


Summary: cause and effect

Comparison ⚖️

Confounding 😕

Randomize! 🎲

Next time

On Wednesday, we'll switch gears and start programming 💻 in Python 🐍.

Further reading 📖:

