# Lecture 12 – Simulations¶

## DSC 10, Spring 2023¶

### Announcements¶

• Lab 3 is due tomorrow at 11:59PM.
• Homework 3 is due on Tuesday 5/2 at 11:59PM.
• The Midterm Project is due on Tuesday, 5/9 at 11:59PM.
• Check out this post on Ed for explanations about common misconceptions from office hours.
• The Midterm Exam is on Friday 5/5 during your assigned lecture.

### Midterm Exam details¶

The Midterm Exam is on Friday 5/5 during your assigned lecture.

• It will be a 50-minute, on-paper, closed-notes exam. We will provide you with the first 2 pages of the reference sheet.
• It will consist of multiple choice, fill-in-the-blank code, and short answer questions.
• Bring a pen/pencil/eraser and a photo ID. No scantron or blue book needed.
• No calculators, computers, notes, or other aids are allowed.
• You will be assigned a specific seat sometime next week.
• Today's material is on the midterm; next week's is not.
• 🚨 Look at past midterm exams at practice.dsc10.com!

### Agenda¶

Simulations.

• Example: What's the probability of getting 60 or more heads if we flip 100 coins?
• Example: The "Monty Hall" Problem.

## Simulations¶

### Simulations¶

• What is the probability of getting 60 or more heads if we flip 100 coins?
• While we could calculate it by hand (and you will learn how to in future courses), we can also estimate it using the computer:
1. Figure out how to run the experiment (flipping 100 coins) once.
2. Repeat the experiment many times.
3. Find the proportion of experiments in which the number of heads was 60 or more.
• This is how we'll use simulations – to estimate, or approximate, a probability through computation.
• The techniques we will introduce in today's lecture will appear in almost every lecture for the remainder of the quarter!

### Making a random choice¶

• To simulate, we need a way to perform a random experiment on the computer (e.g. flipping a coin, rolling a die).
• A helpful function is np.random.choice(options).
• The input, options, is a list or array to choose from.
• The output is a random element in options. By default, all elements are equally likely to be chosen.
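
As a minimal sketch of a single random choice (the coin and die options here are illustrative, not taken from the lecture's own cells):

```python
import numpy as np

# Hypothetical options for illustration: the two sides of a coin.
options = np.array(['Heads', 'Tails'])

# One random draw; by default each element is equally likely.
flip = np.random.choice(options)
print(flip)

# np.random.choice also accepts a plain list, e.g. the faces of a die.
roll = np.random.choice([1, 2, 3, 4, 5, 6])
print(roll)
```

Running the cell again will generally give different results, since each call makes a fresh random draw.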

### Making multiple random choices¶

np.random.choice(options, n) will return an array of n randomly selected elements from options.
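
For instance, a sketch of rolling a die 10 times (the die faces and the count 10 are arbitrary choices for illustration):

```python
import numpy as np

# Roll a six-sided die 10 times in a single call.
rolls = np.random.choice([1, 2, 3, 4, 5, 6], 10)

print(rolls)       # an array of 10 values, each between 1 and 6
print(len(rolls))  # the array has one entry per roll
```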

### With replacement vs. without replacement¶

• By default, np.random.choice selects with replacement.
• That is, after making a selection, that option is still available.
• e.g. if every time you draw a marble from a bag, you put it back.
• If an option can only be selected once, select without replacement by specifying replace=False.
• e.g. if every time you draw a marble from a bag, you do not put it back.
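
The difference can be sketched with a small, hypothetical bag of marbles (the colors are made up for illustration):

```python
import numpy as np

# A hypothetical bag of four distinct marbles.
marbles = np.array(['red', 'green', 'blue', 'yellow'])

# With replacement (the default): the same marble may show up more than once.
with_replacement = np.random.choice(marbles, 3)

# Without replacement: each marble can be drawn at most once.
without_replacement = np.random.choice(marbles, 3, replace=False)

print(with_replacement)
print(without_replacement)
```

Note that when selecting without replacement, asking for more elements than there are options raises an error, since the bag runs out.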

## Example: What's the probability of getting 60 or more heads if we flip 100 coins?¶

### Flipping coins¶

What is the probability of getting 60 or more heads if we flip 100 coins?

Plan:

1. Figure out how to run the experiment (flipping 100 coins) once.
2. Repeat the experiment many times.
3. Find the proportion of experiments in which the number of heads was 60 or more.

### Step 1: Figure out how to run the experiment once¶

• Use np.random.choice to flip 100 coins.
• Use np.count_nonzero to count the number of heads.
• np.count_nonzero(array) returns the number of entries in array that are True.
• Question: Why is it called count_nonzero?
• Answer: In Python, True == 1 and False == 0, so counting the non-zero elements counts the number of Trues.
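
One way to sketch this step, assuming the coin's sides are represented by the strings 'Heads' and 'Tails' (the variable names are illustrative):

```python
import numpy as np

# Flip 100 fair coins in one call.
coins = np.random.choice(['Heads', 'Tails'], 100)

# coins == 'Heads' is an array of True/False values, one per flip.
# Counting the non-zero (True) entries counts the heads.
num_heads = np.count_nonzero(coins == 'Heads')
print(num_heads)

# True behaves like 1 and False like 0, so this counts the two Trues.
print(np.count_nonzero(np.array([True, False, True])))
```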

### Aside: Defining a function to run the experiment¶

This makes it easy to run the experiment repeatedly.
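
A possible version of such a function is sketched below; the name `coin_experiment` is an assumption made for illustration.

```python
import numpy as np

def coin_experiment():
    """Flip 100 fair coins once and return the number of heads."""
    coins = np.random.choice(['Heads', 'Tails'], 100)
    return np.count_nonzero(coins == 'Heads')

# Each call runs the whole experiment again and may give a different count.
print(coin_experiment())
print(coin_experiment())
```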

### Step 2: Repeat the experiment many times¶

• How do we run a piece of code many times? Using a for-loop!
• Each time we run the experiment, we'll need to store the results in an array.
• To do this, we'll use np.append!

### Step 2: Repeat the experiment many times¶

• Imagine we start with a blank sheet of paper, and each time we run the experiment, we write the number of heads we see down on the sheet of paper.
• The sheet will start off empty, but eventually will have one number for each time we ran the experiment.
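
In code, the "blank sheet of paper" can be an empty array that grows by one entry per experiment. This is a sketch; `coin_experiment`, `repetitions`, and `head_counts` are illustrative names.

```python
import numpy as np

def coin_experiment():
    """Flip 100 fair coins once and return the number of heads."""
    coins = np.random.choice(['Heads', 'Tails'], 100)
    return np.count_nonzero(coins == 'Heads')

repetitions = 10_000
head_counts = np.array([])  # the blank sheet of paper

for i in np.arange(repetitions):
    # Write this experiment's result at the end of the sheet.
    # np.append returns a new array, so we reassign head_counts.
    head_counts = np.append(head_counts, coin_experiment())

print(len(head_counts))  # one number per experiment
```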

### Step 3: Find the proportion of experiments in which the number of heads was 60 or more¶

The estimate is quite close to the true theoretical answer, which is approximately 0.028!
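
Putting the three steps together, the proportion could be computed as in this sketch (names such as `head_counts` and `at_least_60` are illustrative). The exact value printed varies from run to run.

```python
import numpy as np

def coin_experiment():
    """Flip 100 fair coins once and return the number of heads."""
    coins = np.random.choice(['Heads', 'Tails'], 100)
    return np.count_nonzero(coins == 'Heads')

# Step 2: repeat the experiment many times.
head_counts = np.array([])
for i in np.arange(10_000):
    head_counts = np.append(head_counts, coin_experiment())

# Step 3: proportion of experiments with 60 or more heads.
at_least_60 = np.count_nonzero(head_counts >= 60) / len(head_counts)
print(at_least_60)
```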

### Visualizing the distribution¶

• This histogram describes the distribution of the number of heads in each experiment.
• Now we see another reason to use density histograms.
• Using density means that areas are probabilities.
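
To see why areas are probabilities, the sketch below computes bar heights with `np.histogram(..., density=True)` instead of drawing a plot (a substitution made here so the numbers can be inspected directly). With one bin per possible number of heads, the total area is 1 and the area of the bars from 60 upward matches the simulated proportion.

```python
import numpy as np

def coin_experiment():
    """Flip 100 fair coins once and return the number of heads."""
    coins = np.random.choice(['Heads', 'Tails'], 100)
    return np.count_nonzero(coins == 'Heads')

head_counts = np.array([])
for i in np.arange(10_000):
    head_counts = np.append(head_counts, coin_experiment())

# One bin per possible number of heads: [0, 1), [1, 2), ..., [100, 101].
bins = np.arange(0, 102)
densities, edges = np.histogram(head_counts, bins=bins, density=True)
widths = np.diff(edges)

# Area of a bar = height (density) * width.
total_area = np.sum(densities * widths)
area_60_or_more = np.sum(densities[60:] * widths[60:])

print(total_area)        # all bars together have area 1
print(area_60_or_more)   # same as the proportion of experiments with >= 60 heads
```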

## Example: The "Monty Hall" Problem¶

### The "Monty Hall" Problem¶

Suppose you’re on a game show, and you’re given the choice of three doors. A car 🚗 is behind one of the doors, and goats 🐐🐐 are behind the other two.

• You pick a door, say Door #2, and the host, who knows what’s behind the doors, opens another door, say Door #3, which has a goat.

• The host then says to you, “Do you want to switch to Door #1 or stay with Door #2?”

• Question: Should you stay or switch?

(The question was posed in Parade magazine’s "Ask Marilyn" column in 1990. It is called the "Monty Hall problem" because Monty Hall hosted a similar game show called "Let's Make a Deal.")

### Let's play!¶

Below, we've embedded the Monty Hall simulator from this site.

### Concept Check ✅ – Answer at cc.dsc10.com¶

Suppose you originally selected Door #2. The host reveals Door #3 to have a goat behind it. What should you do?

A. Stay with Door #2; it has just as high a chance of winning as Door #1. It doesn't matter whether you switch or not.

B. Switch to Door #1; it has a higher chance of winning than Door #2.

### Time to simulate!¶

• Let's estimate the probability of winning if you switch.
• If it's higher than 50%, then switching is the better strategy; otherwise, staying is the better strategy.

Plan:

1. Figure out how to simulate a single game.
2. Play the game many times, switching each time.
3. Compute the proportion of wins.

### Step 1: Simulate a single game¶

When you pick a door, there are three equally-likely outcomes:

• Car.
• Goat #1.
• Goat #2.
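
Picking a door can be sketched as one random choice among the three prizes. The labels 'Car', 'Goat 1', and 'Goat 2' and the name `options` are assumptions made for illustration.

```python
import numpy as np

# The three possible prizes behind the door you pick.
options = np.array(['Car', 'Goat 1', 'Goat 2'])

# Each outcome is equally likely.
behind_picked_door = np.random.choice(options)
print(behind_picked_door)
```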

### Step 1: Simulate a single game¶

When the host opens a different door, they always reveal a goat.

If you always switch, you'll end up winning the prize that is neither behind_picked_door nor revealed.
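
A sketch of the host's reveal and of switching follows. The prize labels and the name `your_prize` are illustrative; when you picked the car, this sketch assumes the host reveals one of the two goats at random.

```python
import numpy as np

options = np.array(['Car', 'Goat 1', 'Goat 2'])
behind_picked_door = np.random.choice(options)

# The host always reveals a goat that is not behind your door.
if behind_picked_door == 'Goat 1':
    revealed = 'Goat 2'
elif behind_picked_door == 'Goat 2':
    revealed = 'Goat 1'
else:
    # You picked the car, so the host can reveal either goat (assumed random).
    revealed = np.random.choice(['Goat 1', 'Goat 2'])

# Switching wins whatever is neither behind your door nor revealed.
for prize in options:
    if prize != behind_picked_door and prize != revealed:
        your_prize = prize

print(behind_picked_door, revealed, your_prize)
```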

### Step 1: Simulate a single game¶

Let's put all of our work into a single function to make it easier to repeat.

Now, every time we call simulate_switch_strategy, the result is your prize.
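
One possible body for `simulate_switch_strategy` is sketched here. The prize labels and the random reveal when you picked the car are assumptions, not necessarily the lecture's exact code.

```python
import numpy as np

def simulate_switch_strategy():
    """Play one game, always switching, and return the prize you win."""
    options = np.array(['Car', 'Goat 1', 'Goat 2'])
    behind_picked_door = np.random.choice(options)

    # The host reveals a goat that is not behind the picked door.
    if behind_picked_door == 'Goat 1':
        revealed = 'Goat 2'
    elif behind_picked_door == 'Goat 2':
        revealed = 'Goat 1'
    else:
        revealed = np.random.choice(['Goat 1', 'Goat 2'])

    # Switching wins the one remaining prize.
    for prize in options:
        if prize != behind_picked_door and prize != revealed:
            return prize

print(simulate_switch_strategy())
```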

### Step 2: Play the game many times¶

We should save your prize in each game; to do so, we'll use np.append.
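
A sketch of the loop is below. The array name `prizes` is illustrative, and the empty array is created with `dtype=str` here because it will hold text; the function body is one possible implementation.

```python
import numpy as np

def simulate_switch_strategy():
    """Play one game, always switching, and return the prize you win."""
    options = np.array(['Car', 'Goat 1', 'Goat 2'])
    behind_picked_door = np.random.choice(options)
    if behind_picked_door == 'Goat 1':
        revealed = 'Goat 2'
    elif behind_picked_door == 'Goat 2':
        revealed = 'Goat 1'
    else:
        revealed = np.random.choice(['Goat 1', 'Goat 2'])
    for prize in options:
        if prize != behind_picked_door and prize != revealed:
            return prize

repetitions = 10_000
prizes = np.array([], dtype=str)  # starts empty; one prize is added per game

for i in np.arange(repetitions):
    prizes = np.append(prizes, simulate_switch_strategy())

print(prizes[:5])
```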

### Step 3: Count the proportion of wins for this strategy (switching)¶

This is quite close to the true probability of winning if you switch, $\frac{2}{3}$.
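
The full switching simulation, ending with the proportion of wins, might look like this sketch (the names and the function body are illustrative assumptions). The printed estimate changes slightly from run to run.

```python
import numpy as np

def simulate_switch_strategy():
    """Play one game, always switching, and return the prize you win."""
    options = np.array(['Car', 'Goat 1', 'Goat 2'])
    behind_picked_door = np.random.choice(options)
    if behind_picked_door == 'Goat 1':
        revealed = 'Goat 2'
    elif behind_picked_door == 'Goat 2':
        revealed = 'Goat 1'
    else:
        revealed = np.random.choice(['Goat 1', 'Goat 2'])
    for prize in options:
        if prize != behind_picked_door and prize != revealed:
            return prize

prizes = np.array([], dtype=str)
for i in np.arange(10_000):
    prizes = np.append(prizes, simulate_switch_strategy())

# Proportion of games in which switching won the car.
switch_win_proportion = np.count_nonzero(prizes == 'Car') / len(prizes)
print(switch_win_proportion)
```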

### Alternate implementation¶

• Looking back at our implementation, we kept track of your prize in each game.
• However, all we really needed to keep track of was the number of games in which you won a car.
• 💡 Idea: Keep a tally of the number of times you won a car. That is, initialize car_count to 0, and add 1 to it each time your prize is a car.

No arrays needed! This strategy won't always work; it depends on the goal of the simulation.
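
The tally idea can be sketched as follows; `car_count` comes from the text above, while the function body and other names are illustrative assumptions.

```python
import numpy as np

def simulate_switch_strategy():
    """Play one game, always switching, and return the prize you win."""
    options = np.array(['Car', 'Goat 1', 'Goat 2'])
    behind_picked_door = np.random.choice(options)
    if behind_picked_door == 'Goat 1':
        revealed = 'Goat 2'
    elif behind_picked_door == 'Goat 2':
        revealed = 'Goat 1'
    else:
        revealed = np.random.choice(['Goat 1', 'Goat 2'])
    for prize in options:
        if prize != behind_picked_door and prize != revealed:
            return prize

repetitions = 10_000
car_count = 0  # running tally of games won

for i in np.arange(repetitions):
    if simulate_switch_strategy() == 'Car':
        car_count = car_count + 1

print(car_count / repetitions)
```

The trade-off is that only the final tally survives, so the individual results are not available afterwards (for example, to draw a histogram).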

### What if you always stay with your original door?¶

In this case, your prize is always the same as what was behind the picked door.
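
A sketch of the staying strategy, using the tally approach, is below. The name `simulate_stay_strategy` is hypothetical. Because the host's reveal does not change what is behind your door, the sketch does not need to simulate it.

```python
import numpy as np

def simulate_stay_strategy():
    """Play one game, always staying, and return the prize you win."""
    options = np.array(['Car', 'Goat 1', 'Goat 2'])
    behind_picked_door = np.random.choice(options)
    # Staying means you keep whatever was behind your original door.
    return behind_picked_door

repetitions = 10_000
car_count = 0

for i in np.arange(repetitions):
    if simulate_stay_strategy() == 'Car':
        car_count = car_count + 1

stay_win_proportion = car_count / repetitions
print(stay_win_proportion)
```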

• This is quite close to the true probability of winning if you stay, $\frac{1}{3}$.
• Conclusion: It's better to switch.
• Why?
• If you originally choose a goat, Monty will reveal the other goat, and you'll win the car by switching.
• If you originally choose a car, you'll win by staying.
• But there are 2 goats and only 1 car, so you win twice as often by switching.
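
The reasoning above can also be written as a short calculation, assuming each of the three doors is equally likely to be your first pick:

$$P(\text{win by switching}) = P(\text{first pick is a goat}) = \frac{2}{3}, \qquad P(\text{win by staying}) = P(\text{first pick is the car}) = \frac{1}{3}.$$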

### Marilyn vos Savant's column in Parade magazine¶

• She received over 10,000 letters in disagreement, including over 1,000 letters from people with Ph.D.s.
• This became a nationwide controversy, even getting a front-page New York Times article in 1991.

## Summary, next time¶

### Simulations find probabilities¶

• Calculating probabilities is important, but can be hard!
• You'll learn plenty of formulas in future DSC classes, if you end up taking them.
• Simulations let us find probabilities through code rather than through math.
• Many real-world scenarios are complicated.
• Simulations are much easier than math in many of these cases.

### The simulation "recipe"¶

To estimate the probability of an event through simulation:

1. Make a function that runs the experiment once.
2. Run that function many times (usually 10,000) with a for-loop, and save the results in an array with np.append.
3. Compute the proportion of times the event occurs using np.count_nonzero.
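
As an illustration of the recipe on a different, hypothetical experiment, the sketch below estimates the probability that the sum of two fair dice is 10 or more (the exact answer is $\frac{6}{36} = \frac{1}{6}$). The function and variable names are made up for this example.

```python
import numpy as np

# 1. A function that runs the experiment once.
def roll_two_dice():
    """Roll two fair six-sided dice and return their sum."""
    return np.sum(np.random.choice(np.arange(1, 7), 2))

# 2. Run it many times, saving each result with np.append.
results = np.array([])
for i in np.arange(10_000):
    results = np.append(results, roll_two_dice())

# 3. Proportion of runs in which the event (sum of 10 or more) occurred.
prob_estimate = np.count_nonzero(results >= 10) / len(results)
print(prob_estimate)
```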

### What's next?¶

• In the next class, we will start talking about sampling.
• Key idea: We want to learn something about a large population (e.g. all undergraduates at UCSD). However, it's far too difficult to survey everyone. If we collect a sample, what can we infer about the larger population?
• Next week's lecture material is not on the midterm.