# Run this cell to set up packages for lecture.
from lec19_imports import *
"... the overall percentage disparity has been small...”
- `np.random.choice` won't help us, because we don't know how large the eligible population is.
- `np.random.multinomial` helps us sample at random from a categorical distribution: `np.random.multinomial(sample_size, pop_distribution)`.
- `np.random.multinomial` samples at random from the population, with replacement, and returns a random array containing counts in each category.
- `pop_distribution` needs to be an array containing the probabilities of each category.

Aside: Example usage of `np.random.multinomial`
On Halloween 👻, you trick-or-treated at 35 houses, each of which had an identical candy box, containing:
At each house, you selected one candy blindly from the candy box.
To simulate the act of going to 35 houses, we can use np.random.multinomial:
np.random.multinomial(35, [0.3, 0.3, 0.4])
array([12, 8, 15])
In our case, a randomly selected member of our population is Black with probability 0.26 and not Black with probability 1 - 0.26 = 0.74.
demographics = [0.26, 0.74]
Each time we run the following cell, we'll get a new random sample of 100 people from this population.
np.random.multinomial(100, demographics)
array([22, 78])
We also need to calculate the statistic, which in this case is the number of Black men in the random sample of 100.
np.random.multinomial(100, demographics)[0]
26
counts = np.array([])
for i in np.arange(10000):
    new_count = np.random.multinomial(100, demographics)[0]
    counts = np.append(counts, new_count)
counts
array([28., 36., 25., ..., 25., 24., 23.])
Was a jury panel with 8 Black men suspiciously unusual?
(bpd.DataFrame().assign(count_black_men=counts)
.plot(kind='hist', bins = np.arange(9.5, 45, 1),
density=True, ec='w', figsize=(10, 5),
title='Empirical Distribution of the Number of Black Men in Simulated Jury Panels of Size 100'));
observed_count = 8
plt.axvline(observed_count, color='black', linewidth=4, label='Observed Number of Black Men in Actual Jury Panel')
plt.legend();
# In 10,000 random experiments, the panel with the fewest Black men had how many?
counts.min()
10.0
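We can also count this directly. The following is a quick sketch (not from the lecture) that re-runs the simulation so the cell is self-contained, then counts how many of the 10,000 simulated panels had 8 or fewer Black men:

```python
import numpy as np

# Re-run the simulation above so this cell stands alone.
demographics = [0.26, 0.74]
counts = np.array([])
for i in np.arange(10000):
    counts = np.append(counts, np.random.multinomial(100, demographics)[0])

# How many simulated panels had 8 or fewer Black men?
np.count_nonzero(counts <= 8)
```

In virtually every run, this count is 0: none of the 10,000 random panels came close to having as few Black men as the actual panel did.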
Let's suppose we find a coin on the ground and we aren't sure whether it's a fair coin.
Out of curiosity (and boredom), we flip it 400 times.
flips_400 = bpd.read_csv('data/flips-400.csv')
flips_400
| | flip | outcome |
|---|---|---|
| 0 | 1 | Tails |
| 1 | 2 | Tails |
| 2 | 3 | Tails |
| ... | ... | ... |
| 397 | 398 | Heads |
| 398 | 399 | Heads |
| 399 | 400 | Tails |
400 rows × 2 columns
flips_400.groupby('outcome').count()
| outcome | flip |
|---|---|
| Heads | 188 |
| Tails | 212 |
How "weird" is it to flip a fair coin 400 times and see only 188 heads?
- "This jury panel was selected at random" or "this jury panel was not selected at random, since there weren't enough Black men on it."
- "This coin is fair" or "this coin is not fair."
We can simulate n flips of a fair coin using `np.random.multinomial(n, [0.5, 0.5])`, but we can't simulate n flips of an unfair coin. In the jury panel example, we repeatedly drew samples from a population that was 26% Black. In our current example, we'll repeatedly flip a fair coin 400 times.
In the jury panel example, this was the number of Black men. In our current example, a reasonable choice is the number of heads.
Think of the test statistic as a number you write down each time you perform an experiment.
In our current example, the observed statistic is 188.
# Computes a single simulated test statistic.
np.random.multinomial(400, [0.5, 0.5])[0]
198
# Computes 10,000 simulated test statistics.
results = np.array([])
for i in np.arange(10000):
    result = np.random.multinomial(400, [0.5, 0.5])[0]
    results = np.append(results, result)
results
array([183., 207., 224., ..., 204., 211., 197.])
Let's visualize the empirical distribution of the test statistic $\text{number of heads}$.
bpd.DataFrame().assign(results=results).plot(kind='hist', bins=np.arange(160, 240, 4),
density=True, ec='w', figsize=(10, 5),
title='Empirical Distribution of the Number of Heads in 400 Flips of a Fair Coin');
plt.legend();
If we observed close to 200 heads, we'd think our coin is fair.
There are two cases in which we'd think our coin is unfair: if we see far more than 200 heads, or far fewer than 200 heads.
This means that the histogram above is divided into three regions, where two of them mean the same thing (we think our coin is unfair).
It would be a bit simpler if we had a histogram that was divided into just two regions. How do we create such a histogram?

Let's define a function that computes a single value of our test statistic. We'll do this often moving forward.
def num_heads_from_200(arr):
    return abs(arr[0] - 200)
num_heads_from_200([188, 212])
12
Now, we'll repeat the act of flipping a fair coin 10,000 times again. The only difference is the test statistic we'll compute each time.
results = np.array([])
for i in np.arange(10000):
    result = num_heads_from_200(np.random.multinomial(400, [0.5, 0.5]))
    results = np.append(results, result)
results
array([24., 22., 10., ..., 17., 11., 4.])
Let's visualize the empirical distribution of our new test statistic, $| \text{number of heads} - 200 |$.
bpd.DataFrame().assign(results=results).plot(kind='hist', bins=np.arange(0, 60, 4),
density=True, ec='w', figsize=(10, 5),
title=r'Empirical Distribution of | Num Heads - 200 | in 400 Flips of a Fair Coin');
plt.axvline(12, color='black', linewidth=4, label='observed statistic (12)')
plt.legend();
In black, we've drawn our observed statistic, 12. Does 12 seem like a reasonable value of the test statistic – that is, how rare was it to see a test statistic of 12 or larger in our simulations?
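One way to quantify this (a sketch; the course formalizes this idea later) is to compute the proportion of simulated statistics that were 12 or larger. The simulation is re-run here so the cell is self-contained:

```python
import numpy as np

# Re-run the simulation of | number of heads - 200 | in 400 fair-coin flips.
results = np.array([])
for i in np.arange(10000):
    heads = np.random.multinomial(400, [0.5, 0.5])[0]
    results = np.append(results, abs(heads - 200))

# Proportion of simulations in which the statistic was 12 or larger.
np.count_nonzero(results >= 12) / 10000
```

This proportion typically comes out to roughly a quarter of the simulations, so a statistic of 12 is not particularly rare under the fair-coin model.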
Since we're using the number of heads as our test statistic again, our simulation code is the same as it was originally.
results = np.array([])
for i in np.arange(10000):
    result = np.random.multinomial(400, [0.5, 0.5])[0]
    results = np.append(results, result)
results
array([199., 203., 204., ..., 198., 191., 196.])
Let's visualize the empirical distribution of the test statistic $\text{number of heads}$, one more time.
bpd.DataFrame().assign(results=results).plot(kind='hist', bins=np.arange(160, 240, 4),
density=True, ec='w', figsize=(10, 5),
title='Empirical Distribution of the Number of Heads in 400 Flips of a Fair Coin');
plt.axvline(188, color='black', linewidth=4, label='observed statistic (188)')
plt.legend();
We'll look more closely at how to draw a conclusion from a hypothesis test and come up with a more precise definition of being "consistent" with the empirical distribution of test statistics generated according to the model.
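As a preview (a rough sketch, not the precise definition we'll develop), one way to measure how consistent the observed 188 heads is with the fair-coin model is the proportion of simulated statistics that were 188 or fewer:

```python
import numpy as np

# Re-simulate the number of heads in 400 flips of a fair coin, 10,000 times.
results = np.array([])
for i in np.arange(10000):
    results = np.append(results, np.random.multinomial(400, [0.5, 0.5])[0])

# Proportion of simulations with 188 or fewer heads.
np.count_nonzero(results <= 188) / 10000
```

This proportion is usually around an eighth of the simulations, so 188 heads is on the low side but not wildly inconsistent with a fair coin.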