# Run this cell to set up packages for lecture.
from lec10_imports import *
Agenda¶
- Booleans.
- Conditional statements (i.e. if-statements).
- Iteration (i.e. for-loops).
Note:
- We've finished introducing new DataFrame manipulation techniques.
- Today we'll cover some foundational programming tools, which will be very relevant as we start to cover more ideas in statistics in the second half of the class.
Booleans¶
Recap: Booleans¶
- bool is a data type in Python, just like int, float, and str.
  - It stands for "Boolean", named after George Boole, a 19th-century mathematician.
- There are only two possible Boolean values: True or False.
  - Yes or no.
  - On or off.
  - 1 or 0.
- Comparisons result in Boolean values.
dept = 'DSC'
course = 10
course < 20
True
type(course < 20)
bool
The in operator¶
Sometimes, we'll want to check if a particular element is in a list/array, or a particular substring is in a string. The in operator can do this for us, and it also results in a Boolean value.
course in [10, 20, 30]
True
'DS' in dept
True
'DS' in 'Data Science'
False
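To see why the last check is False, it may help to remember that in, applied to strings, looks for a contiguous, case-sensitive substring. The sketch below is only an illustration; the variable names (phrase, ds_adjacent, and so on) are made up and do not appear elsewhere in the lecture.

```python
# Hypothetical example string, chosen to match the check above.
phrase = 'Data Science'

# 'D' and 'S' both appear in phrase, but never side by side,
# so 'DS' is not a substring.
ds_adjacent = 'DS' in phrase

# A contiguous run of characters is found.
has_data = 'Data' in phrase

# Matching is case-sensitive: lowercase 'data' does not match 'Data'.
has_lower = 'data' in phrase

# For a list, `in` checks whether any element equals the value.
in_list = 10 in [10, 20, 30]

print(ds_adjacent, has_data, has_lower, in_list)
```

Each of the four results is a bool, so they can be combined with not, and, and or.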
Boolean operators; not¶
There are three operators that allow us to perform arithmetic with Booleans: not, and, and or.
not flips True ↔️ False.
dept == 'DSC'
True
not dept == 'DSC'
False
The and operator¶
The and operator is placed between two bools. It is True if both are True; otherwise, it's False.
80 < 30 and course < 20
False
80 > 30 and course < 20
True
The or operator¶
The or operator is placed between two bools. It is True if at least one is True; otherwise, it's False.
course in [10, 20, 30, 80] or type(course) == str
True
# Both are True!
course in [10, 20, 30, 80] or type(course) == int
True
# Both are False!
course == 80 or type(course) == str
False
course == 10 or (dept == 'DSC' and dept == 'CSE')
True
# Different meaning!
(course == 10 or dept == 'DSC') and dept == 'CSE'
False
# With no parentheses, "and" has precedence.
course == 10 or dept == 'DSC' and dept == 'CSE'
True
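The precedence rule in the last three cells can be checked exhaustively. The sketch below tries all eight combinations of three Booleans; the names a, b, and c are placeholders rather than anything from the lecture.

```python
from itertools import product

# Try every combination of three Boolean values.
for a, b, c in product([True, False], repeat=3):
    no_parens = a or b and c
    and_first = a or (b and c)
    # With no parentheses, Python groups the `and` first.
    assert no_parens == and_first

# Grouping the `or` first can give a different answer.
a, b, c = True, True, False
and_first = a or (b and c)
or_first = (a or b) and c
print(and_first, or_first)
```

When in doubt, adding parentheses makes the intended grouping explicit and costs nothing.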
Note: & and | vs. and and or¶
- Use the & and | operators between two Series. Arithmetic will be done element-wise (separately for each row).
  - This is relevant when writing DataFrame queries, e.g. courses[(courses.get('dept') == 'DSC') & (courses.get('course') == 10)].
- Use the and and or operators between two individual Booleans.
  - e.g. dept == 'DSC' and course == 10.
Conditionals¶
if-statements¶
- Often, we'll want to run a block of code only if a particular conditional expression is True.
- The syntax for this is as follows (don't forget the colon!):
if <condition>:
    <body>
- Indentation matters!
capstone = 'finished'
capstone
'finished'
if capstone == 'finished':
    print('Looks like you are ready to graduate!')
Looks like you are ready to graduate!
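Indentation is what decides which lines belong to the body of an if-statement. The sketch below is a small illustration under made-up assumptions: the list messages is a hypothetical helper used only to record which lines ran, and capstone is set to a value that makes the condition False.

```python
capstone = 'in progress'
messages = []  # Hypothetical list, used only to record what ran.

if capstone == 'finished':
    # Indented: part of the body, skipped because the condition is False.
    messages.append('Looks like you are ready to graduate!')
# Not indented: outside the if-statement, so it always runs.
messages.append('Done checking.')

print(messages)
```

Only the unindented line contributes a message here, since the body is skipped when the condition is False.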
else¶
If you want to do something else if the specified condition is False, use the else keyword.
capstone = 'finished'
capstone
'finished'
if capstone == 'finished':
    print('Looks like you are ready to graduate!')
else:
    print('Before you graduate, you need to finish your capstone project.')
Looks like you are ready to graduate!
elif¶
- What if we want to check more than one condition? Use elif.
- elif: if the specified condition is False, check the next condition.
  - If that condition is False, check the next condition, and so on, until we see a True condition.
  - After seeing a True condition, it evaluates the indented code and stops.
- If none of the conditions are True, the else body is run.
capstone = 'in progress'
units = 123
if capstone == 'finished' and units >= 180:
    print('Looks like you are ready to graduate!')
elif capstone != 'finished' and units < 180:
    print('Before you graduate, you need to finish your capstone project and take',
          180 - units, 'more units.')
elif units >= 180:
    print('Before you graduate, you need to finish your capstone project.')
else:
    print('Before you graduate, you need to take', 180 - units, 'more units.')
Before you graduate, you need to finish your capstone project and take 57 more units.
What if we use if instead of elif?
if capstone == 'finished' and units >= 180:
    print('Looks like you are ready to graduate!')
if capstone != 'finished' and units < 180:
    print('Before you graduate, you need to finish your capstone project and take',
          180 - units, 'more units.')
if units >= 180:
    print('Before you graduate, you need to finish your capstone project.')
else:
    print('Before you graduate, you need to take', 180 - units, 'more units.')
Before you graduate, you need to finish your capstone project and take 57 more units.
Before you graduate, you need to take 57 more units.
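The difference between the two versions is easier to compare when each one returns its messages instead of printing them. The sketch below wraps both in functions; the function names and the shortened message labels are made up for illustration.

```python
def with_elif(capstone, units):
    '''Returns the messages chosen by a single if/elif/else chain.'''
    messages = []
    if capstone == 'finished' and units >= 180:
        messages.append('ready')
    elif capstone != 'finished' and units < 180:
        messages.append('capstone and units')
    elif units >= 180:
        messages.append('capstone')
    else:
        messages.append('units')
    return messages

def with_separate_ifs(capstone, units):
    '''Returns the messages chosen when each check is its own if-statement.'''
    messages = []
    if capstone == 'finished' and units >= 180:
        messages.append('ready')
    if capstone != 'finished' and units < 180:
        messages.append('capstone and units')
    # This if/else pair is independent of the two checks above.
    if units >= 180:
        messages.append('capstone')
    else:
        messages.append('units')
    return messages

print(with_elif('in progress', 123))
print(with_separate_ifs('in progress', 123))
```

A chain of elifs always picks exactly one branch, while separate if-statements are each evaluated on their own. With the separate version, a student who has finished the capstone and has 200 units would be told both that they are ready and that they still need to finish the capstone.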
Example: Percentage to letter grade¶
Below, complete the implementation of the function, grade_converter, which takes in a percentage grade (grade) and returns the corresponding letter grade, according to this table:
Letter | Range |
---|---|
A | [90, 100] |
B | [80, 90) |
C | [70, 80) |
D | [60, 70) |
F | [0, 60) |
Your function should work on these examples:
>>> grade_converter(84)
'B'
>>> grade_converter(60)
'D'
✅ Solution (try it yourself first):
def grade_converter(grade):
    if grade >= 90:
        return 'A'
    elif grade >= 80:
        return 'B'
    elif grade >= 70:
        return 'C'
    elif grade >= 60:
        return 'D'
    else:
        return 'F'
def grade_converter(grade):
    ...
grade_converter(84)
grade_converter(60)
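Once the function is written, the edges of each range in the table are worth checking, since that is where an off-by-one comparison would show up. The sketch below repeats the solution's logic so that it runs on its own; the specific test values are chosen here for illustration.

```python
def grade_converter(grade):
    '''Converts a percentage grade to a letter grade.'''
    # Each elif is only reached if every earlier condition was False,
    # so `grade >= 80` here effectively means 80 <= grade < 90.
    if grade >= 90:
        return 'A'
    elif grade >= 80:
        return 'B'
    elif grade >= 70:
        return 'C'
    elif grade >= 60:
        return 'D'
    else:
        return 'F'

# Check both sides of every cutoff in the table.
boundary_results = []
for grade in [100, 90, 89.9, 80, 79.9, 70, 69.9, 60, 59.9, 0]:
    boundary_results.append(grade_converter(grade))

print(boundary_results)
```

The order of the conditions matters: checking grade >= 60 first would label every passing grade a 'D'.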
Extra Practice¶
def mystery(a, b):
    if (a + b > 4) and (b > 0):
        return 'bear'
    elif (a * b >= 4) or (b < 0):
        return 'triton'
    else:
        return 'bruin'
Without running code:
- What does mystery(2, 2) return?
- Find inputs so that calling mystery will produce 'bruin'.
Iteration¶
![No description has been provided for this image](images/iteration.png)
for-loops¶
import time
print('Launching in...')
for x in [5, 4, 3, 2, 1]:
    print('t-minus', x)
    time.sleep(0.5) # Pauses for half a second.
print('Blast off! 🚀')
Launching in...
t-minus 5
t-minus 4
t-minus 3
t-minus 2
t-minus 1
Blast off! 🚀
for-loops¶
- Loops allow us to repeat the execution of code. There are two types of loops in Python; the for-loop is one of them.
- The syntax of a for-loop is as follows:
for <element> in <sequence>:
    <for body>
- Read this as: "for each element of this sequence, repeat this code."
- Lists, arrays, and strings are all examples of sequences.
- Like with if-statements, indentation matters!
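The same loop syntax works for any kind of sequence. The sketch below loops over a list and a string and records each element visited; the names visited_numbers and visited_characters are made up, and plain lists are used here so that the example runs without any imports (arrays behave the same way in a for-loop).

```python
# Looping over a list visits its elements in order.
visited_numbers = []
for number in [10, 20, 30]:
    visited_numbers.append(number)

# Looping over a string visits one character at a time.
visited_characters = []
for character in 'DSC':
    visited_characters.append(character)

print(visited_numbers)
print(visited_characters)
```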
Activity¶
Using the array colleges, write a for-loop that prints:
Revelle College
John Muir College
Thurgood Marshall College
Earl Warren College
Eleanor Roosevelt College
Sixth College
Seventh College
Eighth College
✅ Solution (try it yourself first):
for college in colleges:
    print(college + ' College')
colleges = np.array(['Revelle', 'John Muir', 'Thurgood Marshall',
                     'Earl Warren', 'Eleanor Roosevelt', 'Sixth', 'Seventh', 'Eighth'])
...
Ellipsis
Example: Multiplication Table¶
- We know how to print the first row of the 12x12 multiplication table, using the multiples function we wrote earlier.
def multiples(k):
    '''This function returns the
    first twelve multiples of k.'''
    return np.arange(k, 13*k, k)
print(multiples(1))
[ 1 2 3 4 5 6 7 8 9 10 11 12]
- Similarly, we would print the second row with print(multiples(2)), the third row with print(multiples(3)), and so on.
- We can condense all these print statements with a for-loop!
for i in np.arange(1, 13):
    print(multiples(i))
[ 1 2 3 4 5 6 7 8 9 10 11 12]
[ 2 4 6 8 10 12 14 16 18 20 22 24]
[ 3 6 9 12 15 18 21 24 27 30 33 36]
[ 4 8 12 16 20 24 28 32 36 40 44 48]
[ 5 10 15 20 25 30 35 40 45 50 55 60]
[ 6 12 18 24 30 36 42 48 54 60 66 72]
[ 7 14 21 28 35 42 49 56 63 70 77 84]
[ 8 16 24 32 40 48 56 64 72 80 88 96]
[ 9 18 27 36 45 54 63 72 81 90 99 108]
[ 10 20 30 40 50 60 70 80 90 100 110 120]
[ 11 22 33 44 55 66 77 88 99 110 121 132]
[ 12 24 36 48 60 72 84 96 108 120 132 144]
- The line print(multiples(i)) is run twelve times:
  - On the first iteration, i is 1.
  - On the second iteration, i is 2.
  - On the third iteration, i is 3.
  - And so on, until i is 12 on the last iteration.
- This happens, even though there is no assignment statement i = anywhere.
- Finally, we add some tabs and other formatting for a nicer-looking multiplication table!
print("\t 1\t2\t3\t4\t5\t6\t7\t8\t9\t10\t11\t12")
print("_"*100)
for i in np.arange(1, 13):
    print(str(i)+"\t|"+"\t".join(multiples(i).astype(str)))
 1 2 3 4 5 6 7 8 9 10 11 12
____________________________________________________________________________________________________
1 |1 2 3 4 5 6 7 8 9 10 11 12
2 |2 4 6 8 10 12 14 16 18 20 22 24
3 |3 6 9 12 15 18 21 24 27 30 33 36
4 |4 8 12 16 20 24 28 32 36 40 44 48
5 |5 10 15 20 25 30 35 40 45 50 55 60
6 |6 12 18 24 30 36 42 48 54 60 66 72
7 |7 14 21 28 35 42 49 56 63 70 77 84
8 |8 16 24 32 40 48 56 64 72 80 88 96
9 |9 18 27 36 45 54 63 72 81 90 99 108
10 |10 20 30 40 50 60 70 80 90 100 110 120
11 |11 22 33 44 55 66 77 88 99 110 121 132
12 |12 24 36 48 60 72 84 96 108 120 132 144
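The loops above never assign i explicitly, because the for-loop does the assignment on each iteration. One way to picture this is to write the loop out by hand. The sketch below uses Python's built-in range in place of np.arange so that it runs without imports, and the names looped and unrolled are made up for illustration.

```python
# Loop version: Python assigns i for us at the start of each iteration.
looped = []
for i in range(1, 4):
    looped.append(i * 10)

# Unrolled version: the same work, with the assignments written out.
unrolled = []
i = 1
unrolled.append(i * 10)
i = 2
unrolled.append(i * 10)
i = 3
unrolled.append(i * 10)

print(looped == unrolled)
```

Like np.arange(1, 4), range(1, 4) stops before 4, which is why a loop over np.arange(1, 13) runs twelve times rather than thirteen.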
Ranges¶
- Recall, each element of a list/array has a numerical position.
- The position of the first element is 0, the position of the second element is 1, etc.
- We can write a for-loop that accesses each element in an array by using its position.
- np.arange will come in handy.
actions = np.array(['ate', 'slept', 'ran'])
feelings = np.array(['content 🙂', 'energized 😃', 'exhausted 😓'])
len(actions)
3
for i in np.arange(len(actions)):
    print(i)
0
1
2
for i in np.arange(len(actions)):
    print('I', actions[i], 'and I felt', feelings[i])
I ate and I felt content 🙂
I slept and I felt energized 😃
I ran and I felt exhausted 😓
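Looping over positions is what lets a single loop read from two sequences at once. The sketch below repeats the idea with plain lists and the built-in range (assumptions made so that it runs without imports), and compares it with Python's built-in zip, which is not covered in this lecture and is shown only as an alternative.

```python
actions = ['ate', 'slept', 'ran']
feelings = ['content', 'energized', 'exhausted']

# Position-based loop: i takes the values 0, 1, 2.
by_position = []
for i in range(len(actions)):
    by_position.append('I ' + actions[i] + ' and I felt ' + feelings[i])

# Alternative (not from the lecture): zip pairs up matching elements.
paired = []
for action, feeling in zip(actions, feelings):
    paired.append('I ' + action + ' and I felt ' + feeling)

print(by_position == paired)
```

The position-based version only works as intended if both sequences have the same length and are in matching order.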
Example: Goldilocks and the Three Bears¶
We don't have to use the loop variable inside the loop!
for i in np.arange(3):
    print('🐻')
print('👧🏼')
🐻
🐻
🐻
👧🏼
Randomization and iteration¶
- In the next few lectures, we'll learn how to simulate random events, like flipping a coin.
- Often, we will:
  1. Run an experiment, e.g. "flip 10 coins."
  2. Compute some statistic, e.g. "number of heads," and write it down somewhere.
  3. Repeat steps 1 and 2 many, many times using a for-loop.
![No description has been provided for this image](images/append.jpg)
np.append¶
- This function takes two inputs:
  - An array.
  - An element to add on to the end of the array.
- It returns a new array. It does not modify the input array.
- We typically use it like this to extend an array by one element:
name_of_array = np.append(name_of_array, element_to_add)
- ⚠️ Remember to store the result!
some_array = np.array([])
np.append(some_array, 'hello')
array(['hello'], dtype='<U32')
some_array
array([], dtype=float64)
# Need to save the new array!
some_array = np.append(some_array, 'hello')
some_array
array(['hello'], dtype='<U32')
some_array = np.append(some_array, 'there')
some_array
array(['hello', 'there'], dtype='<U32')
Example: Coin flipping¶
The function flip(n) flips n fair coins and returns the number of heads it saw. (Don't worry about how it works for now.)
def flip(n):
    '''Returns the number of heads in n simulated coin flips, using randomness.'''
    return np.random.multinomial(n, [0.5, 0.5])[0]
# Run this cell a few times – you'll see different results!
flip(10)
8
Let's repeat the act of flipping 10 coins, 10000 times.
- Each time, we'll use the flip function to flip 10 coins and compute the number of heads we saw.
- We'll store these numbers in an array, heads_array.
- Every time we use our flip function to flip 10 coins, we'll add an element to the end of heads_array.
# heads_array starts empty – before the simulation, we haven't flipped any coins!
heads_array = np.array([])
for i in np.arange(10000):
    # Flip 10 coins and count the number of heads.
    num_heads = flip(10)
    # Add the number of heads seen to heads_array.
    heads_array = np.append(heads_array, num_heads)
Now, heads_array contains 10000 numbers, each corresponding to the number of heads in 10 simulated coin flips.
heads_array
array([4., 6., 5., ..., 4., 5., 4.])
len(heads_array)
10000
(bpd.DataFrame().assign(num_heads=heads_array)
.plot(kind='hist', density=True, bins=np.arange(0, 12), ec='w', legend=False,
title = 'Distribution of the number of heads in 10 coin flips')
);
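The same simulation can be sketched with only Python's standard library, which also makes it possible to fix a seed so that reruns give identical results. Everything named here is an assumption for illustration: flip_py is a hypothetical stand-in for flip, heads_list is a plain list standing in for heads_array, and the seed value is arbitrary.

```python
import random

def flip_py(n, rng):
    '''Hypothetical stand-in for flip(n): counts heads in n fair coin flips.'''
    heads = 0
    for _ in range(n):
        # rng.random() is uniform on [0, 1); a value below 0.5 counts as heads.
        if rng.random() < 0.5:
            heads = heads + 1
    return heads

rng = random.Random(10)  # Fixed seed so that reruns give the same results.

# Unlike np.append, list.append modifies the list in place,
# so there is no need to reassign heads_list inside the loop.
heads_list = []
for i in range(10000):
    heads_list.append(flip_py(10, rng))

average_heads = sum(heads_list) / len(heads_list)
print(len(heads_list), average_heads)
```

With 10000 repetitions, the average number of heads should land close to 5, which matches where the histogram above is centered.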
![No description has been provided for this image](images/accumulate.jpg)
The accumulator pattern¶
- To store our results, we'll typically use an int or an array.
- If using an int, we define an int variable (usually set to 0) before the loop, then use + to add to it inside the loop.
  - Think of this like using a tally.
- If using an array, we create an array (usually empty) before the loop, then use np.append to add to it inside the loop.
  - Think of this like writing the results on a piece of paper.
- This pattern, of repeatedly adding to an int or an array, is called the accumulator pattern.
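Both flavors of the accumulator pattern can be sketched side by side. The data below is made up, and a plain list stands in for an array so that the example runs without imports; with an array, the append line would use np.append and reassign the result.

```python
results = [4, 6, 5, 8, 5, 7, 3]  # Made-up head counts, for illustration only.

# int accumulator (a tally): start at 0, add to it inside the loop.
at_least_six = 0
for num_heads in results:
    if num_heads >= 6:
        at_least_six = at_least_six + 1

# Sequence accumulator (writing results down): start empty, append inside the loop.
running_totals = []
total = 0
for num_heads in results:
    total = total + num_heads
    running_totals.append(total)

print(at_least_six)
print(running_totals)
```

In both cases the accumulator is created before the loop; creating it inside the loop would reset it on every iteration.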
for-loops in DSC 10¶
- Almost every for-loop in DSC 10 will use the accumulator pattern.
- Do not use for-loops to perform mathematical operations on every element of an array or Series.
  - Instead use DataFrame manipulations and built-in array or Series methods.
Helpful video 🎥: For Loops (and when not to use them) in DSC 10.
Working with strings¶
Strings are sequences, so we can iterate over them, too!
for letter in 'uc san diego':
    print(letter.upper())
U
C
 
S
A
N
 
D
I
E
G
O
'california'.count('a')
2
Example: Vowel count¶
Below, complete the implementation of the function vowel_count, which returns the number of vowels in the input string s (including repeats). Example behavior is shown below.
>>> vowel_count('king triton')
3
>>> vowel_count('i go to uc san diego')
8
✅ Solution (try it yourself first):
def vowel_count(s):
    # We need to keep track of the number of vowels seen so far. Before we start, we've seen zero vowels.
    number = 0
    # For each of the 5 vowels:
    for vowel in 'aeiou':
        # Count the number of occurrences of this vowel in s.
        num_vowel = s.count(vowel)
        # Add this count to the variable number.
        number = number + num_vowel
    # Once we've gotten through all 5 vowels, return the answer.
    return number
def vowel_count(s):
    # We need to keep track of the number of vowels seen so far. Before we start, we've seen zero vowels.
    number = 0
    # For each of the 5 vowels:
    # Count the number of occurrences of this vowel in s.
    # Add this count to the variable number.
    # Once we've gotten through all 5 vowels, return the answer.
vowel_count('king triton')
vowel_count('i go to uc san diego')
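The solution loops over the five vowels. A different approach, sketched below, loops over the characters of s instead and combines the three tools from this lecture: a for-loop, an if-statement, and the in operator. The function name vowel_count_by_letter is made up to keep it distinct from the exercise, and like the solution it only counts lowercase vowels.

```python
def vowel_count_by_letter(s):
    '''Counts the lowercase vowels in s by checking one character at a time.'''
    # int accumulator: before the loop, we've seen zero vowels.
    number = 0
    for letter in s:
        # `letter in 'aeiou'` is True exactly when letter is a lowercase vowel.
        if letter in 'aeiou':
            number = number + 1
    return number

print(vowel_count_by_letter('king triton'))
print(vowel_count_by_letter('i go to uc san diego'))
```

Both approaches use an int accumulator; they differ only in which sequence the loop runs over.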
Summary, next time¶
Summary¶
- if-statements allow us to run pieces of code depending on whether certain conditions are True.
- for-loops are used to repeat the execution of code for every element of a sequence.
  - Lists, arrays, and strings are examples of sequences.
Next time¶
- Probability.
- A math lesson – no code!